In this paper, we investigate the problem of nonparametric monotone frontier estimation from the perspective of extreme value theory. This enables us to revisit the asymptotic theory of the popular free disposal hull estimator in a more general setting, to derive new and asymptotically Gaussian estimators and to provide useful asymptotic confidence bands for the monotone boundary function. The finite-sample behavior of the suggested estimators is explored via Monte Carlo experiments. We also apply our approach to a real data set based on the production activity of the French postal services.

1. Introduction

In production theory and efficiency analysis, there is sometimes the need to estimate the 
boundary of a production set (the set of feasible combinations of inputs and outputs). 
This boundary (the production frontier) represents the set of optimal production plans 
so that the efficiency of a production unit (a firm, for example) is obtained by measuring 
the distance from this unit to the estimated production frontier. Parametric approaches 
rely on parametric models for the frontier and the underlying stochastic process, whereas 
nonparametric approaches offer much more flexible models for the data-generating process (see, for example, [4] for recent surveys on this topic).

Formally, in this paper, we consider technologies where $x \in \mathbb{R}^p_+$, a vector of production factors (inputs), is used to produce a single quantity (output) $y \in \mathbb{R}_+$. The attainable production set is then defined, in standard microeconomic theory, as $T = \{(x, y) \in \mathbb{R}^p_+ \times \mathbb{R}_+ \mid x \text{ can produce } y\}$. Assumptions are usually made on this set, such as free disposability of inputs and outputs, meaning that if $(x, y) \in T$, then $(x', y') \in T$ for any $(x', y')$ such that $x' \ge x$ (this inequality must be understood componentwise) and $y' \le y$. To the extent that the efficiency of a firm is a concern, the boundary of $T$ is of interest. The efficient boundary (or production frontier) of $T$ is the locus of optimal production plans (maximal achievable output for a given level of inputs). In our setup, the production frontier is represented by the graph of the production function $\phi(x) = \sup\{y \mid (x, y) \in T\}$. The economic efficiency score of a firm operating at the level $(x, y)$ is then given by the ratio $\phi(x)/y$.

Cazals et al. [2] proposed a probabilistic interpretation of the production frontier. Let $T$ be the support of the joint distribution of a random vector $(X, Y) \in \mathbb{R}^p_+ \times \mathbb{R}_+$ and let $(\Omega, \mathcal{A}, \mathbb{P})$ be the probability space on which the vector of inputs $X$ and the output $Y$ are defined. The distribution function of $(X, Y)$ can be denoted $F(x, y)$ and $F(\cdot \mid x) = F(x, \cdot)/F_X(x)$ will be used to denote the conditional distribution function of $Y$ given $X \le x$, with $F_X(x) = F(x, \infty) > 0$. It has been proven in [2] that

$$\varphi(x) = \sup\{y \ge 0 \mid F(y \mid x) < 1\}$$

is a monotone non-decreasing function of $x$. So, for all $x' \ge x$ with respect to the partial order, $\varphi(x') \ge \varphi(x)$. The graph of $\varphi$ is the smallest non-decreasing surface which is greater than or equal to the upper boundary of $T$. Further, it has been shown that under the free disposability assumption, $\varphi = \phi$, that is, the graph of $\varphi$ coincides with the production frontier.

Since $T$ is unknown, it must be estimated from a sample of i.i.d. firms $\mathcal{X}_n = \{(X_i, Y_i) \mid i = 1, \ldots, n\}$. The free disposal hull (FDH) $\hat T_{FDH} = \{(x, y) \in \mathbb{R}^{p+1}_+ \mid y \le Y_i, x \ge X_i, i = 1, \ldots, n\}$ of $\mathcal{X}_n$ was introduced by [7]. The resulting FDH estimator of $\varphi(x)$ is

$$\hat\varphi_1(x) = \sup\{y \ge 0 \mid \hat F(y \mid x) < 1\} = \max_{i : X_i \le x} Y_i,$$

where $\hat F(y \mid x) = \hat F_n(x, y)/\hat F_X(x)$ with $\hat F_n(x, y) = (1/n)\sum_{i=1}^n 1(X_i \le x, Y_i \le y)$ and $\hat F_X(x) = \hat F_n(x, \infty)$. This estimator represents the lowest monotone step function covering all of the data points $(X_i, Y_i)$. The asymptotic behavior of $\hat\varphi_1(x)$ was first derived by [13] for the consistency and by [12, 14] for the asymptotic sampling distribution. To summarize, under regularity conditions, the FDH estimator $\hat\varphi_1(x)$ is consistent and converges to a Weibull distribution with some unknown parameters. In Park et al. [14], the obtained convergence rate $n^{-1/(p+1)}$ requires that the joint density of $(X, Y)$ has a jump at its support boundary. In addition, the estimation of the parameters of the Weibull distribution requires the specification of smoothing parameters and the resulting procedure has very poor accuracy. In Hwang et al. [12], the convergence of $\hat\varphi_1(x)$ to the Weibull distribution was established in a general case where the density of $(X, Y)$ may decrease to zero or increase toward infinity at a speed of power $\beta$ ($\beta > -1$) of the distance from the frontier. They obtain the convergence rate $n^{-1/(\beta+2)}$ and extend the particular result of Park et al. [14] where $\beta = 0$, but their result is only derived in the simple case of one-dimensional inputs ($p = 1$), which may be of less interest in practice.
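
As an illustration of the definition above, the FDH estimator can be computed directly as the largest observed output among the firms whose inputs are componentwise dominated by $x$. The following is only a minimal sketch: the function name `fdh_frontier` and the toy sample are illustrative assumptions, not part of the original study.

```python
from typing import Sequence


def fdh_frontier(xs: Sequence[Sequence[float]], ys: Sequence[float],
                 x: Sequence[float]) -> float:
    """FDH estimate at x: max of Y_i over all units with X_i <= x componentwise."""
    dominated = [y for xi, y in zip(xs, ys)
                 if all(a <= b for a, b in zip(xi, x))]
    if not dominated:
        # The estimator is only defined where the empirical F_X(x) is positive.
        raise ValueError("no observation with X_i <= x")
    return max(dominated)


# Hypothetical toy sample with p = 2 inputs and one output.
sample_x = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (4.0, 4.0)]
sample_y = [1.5, 1.0, 2.5, 3.0]

print(fdh_frontier(sample_x, sample_y, (2.0, 3.0)))  # 2.5
print(fdh_frontier(sample_x, sample_y, (3.0, 2.0)))  # 1.5
```

The resulting step function is non-decreasing in $x$ by construction, since enlarging $x$ can only enlarge the set of dominated firms.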

In this paper, we first analyze the properties of the FDH estimator from an extreme value theory perspective. In doing so, we generalize and extend the results of Park et al. [14] and Hwang et al. [12] in at least three directions. First, we provide the necessary and sufficient condition for the FDH estimator to converge in distribution and we specify the asymptotic distribution with the appropriate rate of convergence. We also provide a limit theorem for moments in a general framework. Second, we show how the unknown parameter $\rho_x > 0$, involved in the necessary and sufficient extreme value conditions, is linked to the dimension $p + 1$ of the data and to the shape parameter $\beta > -1$ of the joint density: in the general setting where $p \ge 1$ and $\beta = \beta_x$ may depend on $x$, we obtain, under a convenient regularity condition, the general convergence rate $n^{-1/\rho_x} = n^{-1/(\beta_x + p + 1)}$ of the FDH estimator $\hat\varphi_1(x)$. Third, we suggest a strongly consistent and asymptotically normal estimator of the unknown parameter $\rho_x$ of the asymptotic Weibull distribution of $\hat\varphi_1(x)$. This also answers the important question of how to estimate the shape parameter $\beta_x$ of the joint density of $(X, Y)$ when it approaches the frontier of the support $T$.

By construction, the FDH estimator is very non-robust to extremes. Recently, Aragon et al. [1] constructed an original estimator of $\varphi(x)$, which is more robust than $\hat\varphi_1(x)$, but which keeps the same limiting Weibull distribution as $\hat\varphi_1(x)$ under the restrictive condition $\beta = 0$. In this paper, we provide further insights and generalize their main result. We also suggest attractive estimators of $\varphi(x)$ converging to a normal distribution, which appear to be robust to outliers. The paper is organized as follows. Section 2 presents the main results of the paper. Section 3 illustrates how the theoretical asymptotic results behave in finite-sample situations and gives an example with a real data set on the production activity of the French postal services. Section 4 concludes the paper, with proofs deferred to the Appendix.

2. The main results 

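From now on, we assume that $x \in \mathbb{R}^p_+$ is such that $F_X(x) > 0$ and will denote by $\varphi_\alpha(x)$ and $\hat\varphi_\alpha(x)$, respectively, the $\alpha$-quantiles of the distribution function $F(\cdot \mid x)$ and its empirical version $\hat F(\cdot \mid x)$,

$$\varphi_\alpha(x) = \inf\{y \ge 0 \mid F(y \mid x) \ge \alpha\} \quad\text{and}\quad \hat\varphi_\alpha(x) = \inf\{y \ge 0 \mid \hat F(y \mid x) \ge \alpha\}$$

with $\alpha \in\, ]0, 1]$. When $\alpha \uparrow 1$, the conditional quantile $\varphi_\alpha(x)$ tends to $\varphi_1(x)$, which coincides with the frontier function $\varphi(x)$. Likewise, $\hat\varphi_\alpha(x)$ tends to the FDH estimator $\hat\varphi_1(x)$ of $\varphi(x)$ as $\alpha \uparrow 1$.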
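
The empirical quantile $\hat\varphi_\alpha(x)$ reduces to an order statistic of the outputs $Y_i$ with $X_i \le x$: if $N_x$ such outputs are sorted increasingly, the infimum is attained at the $\lceil \alpha N_x \rceil$-th of them. The sketch below illustrates this; the function name, the toy data and the small numerical tolerance are assumptions made for the example only.

```python
import math
from typing import Sequence


def conditional_quantile(xs: Sequence[Sequence[float]], ys: Sequence[float],
                         x: Sequence[float], alpha: float) -> float:
    """Empirical alpha-quantile of Y given X <= x, i.e. inf{y : F_hat(y|x) >= alpha}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in ]0, 1]")
    below = sorted(y for xi, y in zip(xs, ys)
                   if all(a <= b for a, b in zip(xi, x)))
    n_x = len(below)
    if n_x == 0:
        raise ValueError("no observation with X_i <= x")
    # Smallest rank j with j / n_x >= alpha; the tolerance guards against
    # floating-point noise when alpha * n_x should be an exact integer.
    j = max(1, math.ceil(alpha * n_x - 1e-9))
    return below[j - 1]


# Hypothetical sample with a single input (p = 1).
xs = [(0.1,), (0.2,), (0.3,), (0.4,), (0.9,)]
ys = [0.05, 0.15, 0.10, 0.35, 0.80]

print(conditional_quantile(xs, ys, (0.5,), 1.0))   # alpha = 1 gives the FDH value
print(conditional_quantile(xs, ys, (0.5,), 0.75))
```

With $\alpha = 1$ the function returns the largest dominated output, in agreement with the FDH estimator $\hat\varphi_1(x)$.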

2.1. Asymptotic Weibull distribution 

We first derive the following interesting results on the problem of convergence in distribution of suitably normalized maxima $b_n^{-1}(\hat\varphi_1(x) - \varphi(x))$. We will denote by $\Gamma(\cdot)$ the gamma function.

Theorem 2.1. (i) If there exist $b_n > 0$ and some non-degenerate distribution function $G_x$ such that

$$b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) \xrightarrow{d} G_x, \qquad (2.1)$$

then $G_x(y)$ coincides with $\Psi_{\rho_x}(y) = \exp\{-(-y)^{\rho_x}\}$ with support $]-\infty, 0]$ for some $\rho_x > 0$.

(ii) There exists $b_n > 0$ such that $b_n^{-1}(\hat\varphi_1(x) - \varphi(x))$ converges in distribution if and only if

$$\lim_{t \to \infty} \{1 - F(\varphi(x) - 1/(tz) \mid x)\}/\{1 - F(\varphi(x) - 1/t \mid x)\} = z^{-\rho_x} \quad\text{for all } z > 0 \qquad (2.2)$$

(regular variation with exponent $-\rho_x$, notation $1 - F(\varphi(x) - \frac{1}{t} \mid x) \in RV_{-\rho_x}$). In this case, the norming constants $b_n$ can be chosen as $b_n = \varphi(x) - \varphi_{1 - 1/(nF_X(x))}(x)$.

(iii) Given (2.2), $\lim_{n \to \infty} E\{b_n^{-1}(\varphi(x) - \hat\varphi_1(x))\}^k = \Gamma(1 + k\rho_x^{-1})$ for all integers $k \ge 1$ and

$$\lim_{n \to \infty} P\left[\frac{\hat\varphi_1(x) - E(\hat\varphi_1(x))}{\{\mathrm{Var}(\hat\varphi_1(x))\}^{1/2}} \le y\right] = \Psi_{\rho_x}\left[\{\Gamma(1 + 2\rho_x^{-1}) - \Gamma^2(1 + \rho_x^{-1})\}^{1/2}\, y - \Gamma(1 + \rho_x^{-1})\right].$$

Remark 2.1. Since the function $t \mapsto F_X(x)[1 - F(\varphi(x) - \frac{1}{t} \mid x)] \in RV_{-\rho_x}$ (regularly varying in $t \to \infty$) by (2.2), this function can be represented as $t^{-\rho_x} L_x(t)$ with $L_x(\cdot) \in RV_0$ ($L_x$ being slowly varying) and so the extreme value condition (2.2) holds if and only if we have the following representation:

$$F_X(x)[1 - F(y \mid x)] = L_x(\{\varphi(x) - y\}^{-1})(\varphi(x) - y)^{\rho_x} \quad\text{as } y \uparrow \varphi(x). \qquad (2.3)$$

In the particular case where $L_x(\{\varphi(x) - y\}^{-1}) = \ell_x$ is a strictly positive function in $x$, it is shown in the next corollary that $b_n \sim (n\ell_x)^{-1/\rho_x}$. From now on, a random variable $W$ is said to follow the distribution Weibull$(1, \rho_x)$ if $W^{\rho_x}$ is exponential with parameter 1.

Corollary 2.1. Given (2.3) or, equivalently, (2.2) with $L_x(\{\varphi(x) - y\}^{-1}) = \ell_x > 0$, we have

$$(n\ell_x)^{1/\rho_x}(\varphi(x) - \hat\varphi_1(x)) \xrightarrow{d} \text{Weibull}(1, \rho_x) \quad\text{as } n \to \infty.$$
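
The moments $\Gamma(1 + k\rho_x^{-1})$ appearing in Theorem 2.1(iii) are those of the Weibull$(1, \rho_x)$ limit, which can be checked numerically by drawing $W = E^{1/\rho_x}$ with $E$ standard exponential. The sketch below does so for $\rho_x = 2$; the function names, the seed and the sample size are arbitrary choices made for illustration.

```python
import math
import random


def weibull_moment(k: int, rho: float) -> float:
    """k-th moment of W when W**rho is exponential with parameter 1."""
    return math.gamma(1.0 + k / rho)


def sample_weibull(rho: float, size: int, seed: int = 0) -> list:
    """Draw W = E**(1/rho) with E standard exponential, so W**rho ~ Exp(1)."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) ** (1.0 / rho) for _ in range(size)]


rho = 2.0
draws = sample_weibull(rho, 200_000)
empirical_mean = sum(draws) / len(draws)
empirical_second = sum(w * w for w in draws) / len(draws)

print(empirical_mean, weibull_moment(1, rho))    # both close to Gamma(1.5)
print(empirical_second, weibull_moment(2, rho))  # both close to Gamma(2) = 1
```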

Remark 2.2. Park et al. [14] and Hwang et al. [12] have obtained similar results under more restrictive conditions. Indeed, a unified formulation of the assumptions used in [12, 14] can be expressed as

$$f(x, y) = c_x\{\varphi(x) - y\}^{\beta} + o(\{\varphi(x) - y\}^{\beta}) \quad\text{as } y \uparrow \varphi(x), \qquad (2.4)$$

where $f(x, y)$ is the joint density of $(X, Y)$, $\beta$ is a constant satisfying $\beta > -1$ and $c_x$ is a strictly positive function in $x$. Under the restrictive condition that $f$ is strictly positive on the frontier (that is, $\beta = 0$), Park et al. [14], among others, have obtained the limiting Weibull distribution of the FDH estimator with the convergence rate $n^{-1/(p+1)}$. When $\beta$ may be non-null, Hwang et al. [12] have obtained the asymptotic Weibull distribution with the convergence rate $n^{-1/(\beta+2)}$ in the simple case $p = 1$ (here, it is also assumed that (2.4) holds uniformly in a neighborhood of the point at which we want to estimate $\varphi(\cdot)$, and that this frontier function is strictly increasing in that neighborhood and satisfies a Lipschitz condition of order 1). In the general setting where $p \ge 1$ and $\beta = \beta_x > -1$ may depend on $x$, we have the following, more general, result, which involves the link between the tail index $\rho_x$, the data dimension $p + 1$ and the shape parameter $\beta_x$ of the joint density near the boundary.

Corollary 2.2. If the condition of Corollary 2.1 holds with $F(x, y)$ being differentiable near the frontier (that is, $\ell_x > 0$, $\rho_x > p$ and $\varphi(x)$ are differentiable in $x$ with first partial derivatives of $\varphi(x)$ being strictly positive), then (2.4) holds with $\beta = \beta_x = \rho_x - (p + 1)$ and we have

$$(n\ell_x)^{1/(\beta_x + p + 1)}(\varphi(x) - \hat\varphi_1(x)) \xrightarrow{d} \text{Weibull}(1, \beta_x + p + 1) \quad\text{as } n \to \infty.$$

Remark 2.3. We assume the differentiability of the functions $\ell_x$, $\rho_x$ with $\rho_x > p$ and $\varphi(x)$ in order to ensure the existence of the joint density near its support boundary. We distinguish between three different behaviors of this density at the frontier point $(x, \varphi(x)) \in \mathbb{R}^{p+1}$ based on how the value of $\rho_x$ compares to the dimension $(p + 1)$: when $\rho_x > p + 1$, the joint density decays to zero at a speed of power $\rho_x - (p + 1)$ of the distance from the frontier; when $\rho_x = p + 1$, the density has a sudden jump at the frontier; when $\rho_x < p + 1$, the density increases toward infinity at a speed of power $\rho_x - (p + 1)$ of the distance from the frontier. The case $\rho_x \le p + 1$ corresponds to sharp or fault-type frontiers.

Remark 2.4. As an immediate consequence of Corollary 2.2, when $p = 1$ and $\beta_x = \beta$ (or, equivalently, $\rho_x = \rho$) does not depend on $x$, we obtain the convergence in distribution of the FDH estimator, as in Hwang et al. [12] (see Remark 2.2), with the same convergence rate $n^{-1/(\beta+2)}$ (in the notation of [12], Theorem 1, fx(x) $= \ell_x(\beta + 2)\varphi'(x) = \ell_x\rho_x\varphi'(x)$). In the other particular case where the joint density is strictly positive on the frontier, we achieve the best rate of convergence $n^{-1/(p+1)}$, as in Park et al. [14] (in the notation of Theorem 3.1 in [14], $\mu_{NW,0}/y = \ell_x^{1/(p+1)} = \ell_x^{1/\rho_x}$).

Note, also, that the condition (2.4) with $\beta = \beta_x > -1$ (as in Corollary 2.2) has been considered by [8, 10, 11]. In Section 2.3, we answer the important question of how to estimate the shape parameter $\beta_x$ in (2.4) or, equivalently, the regular variation exponent $\rho_x$ in (2.2).

As an immediate consequence of Theorem 2.1(iii) in conjunction with Corollary 2.2, we obtain

$$E\{\varphi(x) - \hat\varphi_1(x)\}^k = k\{\beta_x + p + 1\}^{-1}(n\ell_x)^{-k/(\beta_x + p + 1)}\,\Gamma(k\{\beta_x + p + 1\}^{-1}) + o(n^{-k/(\beta_x + p + 1)}). \qquad (2.5)$$

This extends the limit theorem of moments of Park et al. ([14], Theorem 3.3) to the more general setting where $\beta_x$ may be non-null. Likewise, Hwang et al. ([12], Remark 1) provide (2.5) only for $k \in \{1, 2\}$, $p = 1$ and $\beta_x = \beta$. The result (2.5) also reflects the well-known curse of dimensionality from which the FDH estimator $\hat\varphi_1(x)$ suffers as the number $p$ of inputs increases, as pointed out earlier by Park et al. [14] in the particular case where $\beta_x = 0$.
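
To make the curse of dimensionality concrete, one can compare the scale $n^{-1/(\beta_x + p + 1)}$ of the FDH error for several input dimensions. The sketch below ignores the constant $\ell_x$ and is meant only as an order-of-magnitude illustration; the function names are hypothetical.

```python
def fdh_rate_exponent(p: int, beta: float = 0.0) -> float:
    """Exponent 1/(beta + p + 1) in the rate n**(-1/(beta + p + 1))."""
    if p < 1 or beta <= -1.0:
        raise ValueError("need p >= 1 and beta > -1")
    return 1.0 / (beta + p + 1.0)


def sample_size_for_scale(target: float, p: int, beta: float = 0.0) -> float:
    """Sample size n solving n**(-1/(beta + p + 1)) = target (constant ell_x ignored)."""
    return target ** (-(beta + p + 1.0))


for p in (1, 2, 3, 5):
    # With beta = 0 (density with a jump at the frontier), reaching the same
    # error scale of 0.1 requires 10**(p + 1) observations.
    print(p, fdh_rate_exponent(p), sample_size_for_scale(0.1, p))
```

Under these simplifications, each additional input multiplies by ten the sample size needed to keep the error scale at 0.1.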

2.2. Robust frontier estimators

By an appropriate choice of $\alpha$ as a function of $n$, Aragon et al. [1] have shown that $\hat\varphi_\alpha(x)$ estimates the full frontier $\varphi(x)$ itself and converges to the same Weibull distribution as the FDH estimator $\hat\varphi_1(x)$ under the restrictive conditions of [14]. The next theorem provides further insights and generalizes their main result.

Theorem 2.2.

(i) If $b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) \xrightarrow{d} G_x$, then for any fixed integer $k \ge 0$,

$$b_n^{-1}(\hat\varphi_{1 - k/(n\hat F_X(x))}(x) - \varphi(x)) \xrightarrow{d} H_x \quad\text{as } n \to \infty$$

for the distribution function $H_x(y) = G_x(y)\sum_{i=0}^{k}(-\log G_x(y))^i/i!$.

(ii) Suppose that the upper bound of the support of $Y$ is finite. If $b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) \xrightarrow{d} G_x$, then $b_n^{-1}(\hat\varphi_{\alpha_n}(x) - \varphi(x)) \xrightarrow{d} G_x$ for all sequences $\alpha_n \to 1$ satisfying $n b_n^{-1}(1 - \alpha_n) \to 0$.

Remark 2.5. When $\hat\varphi_1(x)$ converges in distribution, the estimator $\hat\varphi_{\alpha_n}(x)$, for $\alpha_n := 1 - k/(n\hat F_X(x)) < 1$ (that is, $k = 1, 2, \ldots$, in Theorem 2.2(i)), estimates $\varphi(x)$ itself and also converges in distribution, with the same scaling, but a different limit distribution (here, $n b_n^{-1}(1 - \alpha_n) \to \infty$). To recover the same limit distribution as the FDH estimator, it suffices to require that $\alpha_n \to 1$ rapidly so that $n b_n^{-1}(1 - \alpha_n) \to 0$. This extends the main result of Aragon et al. ([1], Theorem 4.3), where the convergence rate achieves $n^{-1/(p+1)}$ under the restrictive assumption that the density of $(X, Y)$ is strictly positive on the frontier. Note, also, that the estimate $\hat\varphi_{\alpha_n}$ does not envelop all of the data points, providing a robust alternative to the FDH frontier $\hat\varphi_1$; see [3] for an analysis of its quantitative and qualitative robustness properties.
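
For $\alpha_n = 1 - k/(n\hat F_X(x))$, the quantile $\hat\varphi_{\alpha_n}(x)$ is, in the absence of ties, the $(k+1)$-th largest output among the observations with $X_i \le x$, so a single aberrant output cannot move it once $k \ge 1$. The sketch below illustrates this on made-up numbers; the function name and the data are assumptions for the example.

```python
from typing import Sequence


def quantile_frontier(ys_below: Sequence[float], k: int) -> float:
    """(k+1)-th largest output among units with X_i <= x; k = 0 is the FDH value."""
    ordered = sorted(ys_below, reverse=True)
    if not 0 <= k < len(ordered):
        raise ValueError("k must satisfy 0 <= k < number of dominated units")
    return ordered[k]


# Hypothetical outputs of the firms with X_i <= x, then one gross outlier added.
clean = [0.30, 0.42, 0.45, 0.47, 0.49]
contaminated = clean + [5.0]

print(quantile_frontier(clean, 0), quantile_frontier(contaminated, 0))  # FDH jumps to the outlier
print(quantile_frontier(clean, 1), quantile_frontier(contaminated, 1))  # k = 1 moves only slightly
```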



2.3. Conditional tail index estimation 

The important question of how to estimate $\rho_x$ from the multivariate random sample $\mathcal{X}_n$ is very similar to the problem of estimating the so-called extreme value index, which is based on a sample of univariate random variables. An attractive estimation method has been proposed by [15], which can be easily adapted to our conditional approach: let $k = k_n$ be a sequence of integers tending to infinity and let $k/n \to 0$ as $n \to \infty$. A Pickands-type estimate of $\rho_x$ can be derived as

$$\hat\rho_x = \log 2 \left\{\log \frac{\hat\varphi_{1 - (2k-1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (4k-1)/(n\hat F_X(x))}(x)}{\hat\varphi_{1 - (k-1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k-1)/(n\hat F_X(x))}(x)}\right\}^{-1}.$$

The following result is particularly important since it allows the hypothesis $\rho_x > 0$ to be tested and will later be employed to derive asymptotic confidence intervals for $\varphi(x)$.
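
In terms of order statistics, the Pickands-type estimate only needs the $k$-th, $2k$-th and $4k$-th largest outputs among the observations with $X_i \le x$, since $\hat\varphi_{1 - (j-1)/(n\hat F_X(x))}(x)$ is the $j$-th largest of them. The following sketch assumes no ties; the function name and the synthetic data (exact quantiles of a tail with $\rho_x = 2$) are illustrative assumptions.

```python
import math
from typing import Sequence


def pickands_rho(ys_below: Sequence[float], k: int) -> float:
    """Pickands-type estimate of rho_x from the outputs Y_i with X_i <= x."""
    ordered = sorted(ys_below, reverse=True)
    if k < 1 or 4 * k > len(ordered):
        raise ValueError("need 1 <= k <= N_x / 4")
    q1 = ordered[k - 1]        # quantile of order 1 - (k - 1)/N_x
    q2 = ordered[2 * k - 1]    # quantile of order 1 - (2k - 1)/N_x
    q4 = ordered[4 * k - 1]    # quantile of order 1 - (4k - 1)/N_x
    if q1 == q2 or q2 == q4:
        raise ValueError("tied order statistics: estimate undefined")
    ratio = (q2 - q4) / (q1 - q2)
    if ratio == 1.0:
        raise ValueError("spacing ratio equal to 1: estimate undefined")
    return math.log(2.0) / math.log(ratio)


# Synthetic check: the j-th largest value is 1 - (j/N)**(1/2), that is, an
# idealised sample whose upper tail has rho = 2 and frontier value 1.
N = 400
synthetic = [1.0 - math.sqrt(j / N) for j in range(1, N + 1)]
print(pickands_rho(synthetic, 10))  # close to 2
```

On real samples the estimate can be very unstable in $k$, which is why a plot of $k \mapsto \hat\rho_x(k)$ is usually inspected before a value is retained.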

Theorem 2.3. (i) If (2.2) holds, $k_n \to \infty$ and $k_n/n \to 0$, then $\hat\rho_x \xrightarrow{p} \rho_x$.

(ii) If (2.2) holds, $k_n/n \to 0$ and $k_n/\log\log n \to \infty$, then $\hat\rho_x \xrightarrow{a.s.} \rho_x$.

(iii) Assume that $U(t) := \varphi_{1 - 1/(tF_X(x))}(x)$, $t > 1/F_X(x)$, has a positive derivative and that there exists a positive function $A(\cdot)$ such that for $z > 0$, $\lim_{t \to \infty}\{(tz)^{1 + 1/\rho_x}U'(tz) - t^{1 + 1/\rho_x}U'(t)\}/A(t) = \pm\log(z)$, for either choice of the sign ($\Pi$-variation, which will in the sequel be denoted by $\pm t^{1 + 1/\rho_x}U'(t) \in \Pi(A)$). Then,

$$\sqrt{k_n}(\hat\rho_x - \rho_x) \xrightarrow{d} \mathcal{N}(0, \sigma^2(\rho_x)), \qquad (2.6)$$

with asymptotic variance $\sigma^2(\rho_x) = \rho_x^2(2^{1 - 2/\rho_x} + 1)/\{(2^{-1/\rho_x} - 1)\log 4\}^2$, for $k_n \to \infty$ satisfying $k_n = o(n/g^{-1}(n))$, where $g^{-1}$ is the generalized inverse function of $g(t) = t^{3 + 2/\rho_x}\{U'(t)/A(t)\}^2$.

(iv) If, for some $\kappa > 0$ and $\delta > 0$, the function $\{t^{\rho_x - 1}F'(\varphi(x) - \frac{1}{t} \mid x) - \delta\} \in RV_{-\kappa}$, then (2.6) holds with $g(t) = t^{3 + 2/\rho_x}\{U'(t)/(t^{1 + 1/\rho_x}U'(t) - [\delta F_X(x)]^{-1/\rho_x}(\rho_x)^{1/\rho_x - 1})\}^2$.

Remark 2.6. Note that the second order regular variation conditions (iii) and (iv) of Theorem 2.3 are difficult to check in practice, which makes the theoretical choice of the sequence $\{k_n\}$ a hard problem. In practice, in order to choose a reasonable estimate $\hat\rho_x(k_n)$ of $\rho_x$, one can construct the plot of $\hat\rho_x$, consisting of the points $\{(k, \hat\rho_x(k)), 1 \le k \le n\hat F_X(x)/4\}$, and select a value of $\hat\rho_x$ at which the obtained graph looks stable. This technique is known as the Pickands plot in the univariate extreme value literature (see, for example, [17] and the references therein, Section 4.5, pages 93-96). It is this kind of idea that guides the automatic data-driven rule we suggest in Section 3.

We can also easily adapt the well-known moment estimator for the index of a univariate extreme value distribution (Dekkers et al. [6]) to our conditional setup. Define

$$M_n^{(j)} = \frac{1}{k}\sum_{i=0}^{k-1}\left(\log\hat\varphi_{1 - i/(n\hat F_X(x))}(x) - \log\hat\varphi_{1 - k/(n\hat F_X(x))}(x)\right)^j$$

for each $j = 1, 2$ and $k = k_n < n$.

We can then define the moment-type estimator for the conditional regular-variation exponent $\rho_x$ as

$$\tilde\rho_x = -\left\{M_n^{(1)} + 1 - \tfrac{1}{2}\left[1 - (M_n^{(1)})^2/M_n^{(2)}\right]^{-1}\right\}^{-1}.$$
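
The moment-type estimator can be sketched in the same way, using the $k$ largest outputs among the observations with $X_i \le x$ and the $(k+1)$-th largest as threshold. The code below assumes strictly positive outputs without ties; the function name and the synthetic data (exact quantiles of a tail with $\rho_x = 2$) are illustrative assumptions.

```python
import math
from typing import Sequence


def moment_rho(ys_below: Sequence[float], k: int) -> float:
    """Moment-type estimate of rho_x from the outputs Y_i with X_i <= x."""
    ordered = sorted(ys_below, reverse=True)
    if k < 1 or k >= len(ordered):
        raise ValueError("need 1 <= k < N_x")
    threshold = ordered[k]  # quantile of order 1 - k/N_x
    if threshold <= 0.0:
        raise ValueError("logarithms require positive outputs")
    log_excess = [math.log(y) - math.log(threshold) for y in ordered[:k]]
    m1 = sum(log_excess) / k
    m2 = sum(d * d for d in log_excess) / k
    if m2 == 0.0 or m1 * m1 == m2:
        raise ValueError("degenerate log-spacings: estimate undefined")
    # Moment estimate of the extreme value index; rho_x is minus its inverse.
    gamma_hat = m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    return -1.0 / gamma_hat


# Synthetic check: j-th largest value 1 - ((j - 0.5)/N)**(1/2), an idealised
# sample whose upper tail has rho = 2 and frontier value 1.
N = 20_000
synthetic = [1.0 - math.sqrt((j - 0.5) / N) for j in range(1, N + 1)]
print(moment_rho(synthetic, 1000))  # roughly 2, up to a small bias
```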

Theorem 2.4. (i) If (2.2) holds, $k_n/n \to 0$ and $k_n \to \infty$, then $\tilde\rho_x \xrightarrow{p} \rho_x$.

(ii) If (2.2) holds, $k_n/n \to 0$ and $k_n/(\log n)^{\delta} \to \infty$ for some $\delta > 0$, then $\tilde\rho_x \xrightarrow{a.s.} \rho_x$.

(iii) If $\pm t^{1/\rho_x}\{\varphi(x) - U(t)\} \in \Pi(B)$ for some positive function $B$, then $\sqrt{k_n}(\tilde\rho_x - \rho_x)$ has, asymptotically, a normal distribution with mean zero and variance

$$\rho_x(2 + \rho_x)(1 + \rho_x)^2\left\{4 - 8\frac{(2 + \rho_x)}{(3 + \rho_x)} + \frac{(11 + 5\rho_x)(2 + \rho_x)}{(3 + \rho_x)(4 + \rho_x)}\right\}$$

for $k_n \to \infty$ satisfying $k_n = o(n/g^{-1}(n))$, where $g(t) = t^{1 + 2/\rho_x}[\{\log\varphi(x) - \log U(t)\}/B(t)]^2$.

Remark 2.7. Note that the $\Pi$-variation condition $\pm t^{1 + 1/\rho_x}U'(t) \in \Pi$ of Theorem 2.3(iii) is equivalent to $\pm(t^{1/\rho_x}\{\varphi(x) - U(t)\})' \in RV_{-1}$, following Theorem A.3 in [5], and that this equivalent regular-variation condition implies that $\pm t^{1/\rho_x}\{\varphi(x) - U(t)\} \in \Pi$, according to [16], Proposition 0.11(a), with auxiliary function $B(t) = \pm t(t^{1/\rho_x}\{\varphi(x) - U(t)\})'$. Hence, the condition of Theorem 2.3(iii) implies that of Theorem 2.4(iii). Note, also, that a result similar to Theorem 2.4(iii) can be stated under the conditions of Theorem 2.3(iv).

2.4. Asymptotic confidence intervals 

The next theorem enables the construction of confidence intervals for $\varphi(x)$ and for high quantile-type frontiers $\varphi_{1 - p_n/F_X(x)}(x)$ when $p_n \to 0$ and $np_n \to \infty$.

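Theorem 2.5.

(i) Suppose that $F(\cdot \mid x)$ has a positive density $F'(\cdot \mid x)$ such that $F'(\varphi(x) - \frac{1}{t} \mid x) \in RV_{1 - \rho_x}$. Then,

$$\sqrt{2k_n}\,\frac{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \varphi_{1 - p_n/F_X(x)}(x)}{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)} \xrightarrow{d} \mathcal{N}(0, V_1(\rho_x)),$$

where $V_1(\rho_x) = \rho_x^{-2}2^{1 - 2/\rho_x}/(2^{-1/\rho_x} - 1)^2$, provided that $p_n \to 0$, $np_n \to \infty$ and $k_n = [np_n]$.

(ii) Suppose that the conditions of Theorem 2.3(iii) or (iv) hold, and define

$$\hat\varphi_1^1(x) := (2^{1/\hat\rho_x} - 1)^{-1}\{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)\} + \hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x).$$

Then, putting $V_2(\rho_x) = 3\rho_x^{-2}2^{-1 - 2/\rho_x}/(2^{-1/\rho_x} - 1)^6$, we have

$$\sqrt{2k_n}\,\frac{\hat\varphi_1^1(x) - \varphi(x)}{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)} \xrightarrow{d} \mathcal{N}(0, V_2(\rho_x)).$$

(iii) Suppose that the conditions of Theorem 2.3(iii) or (iv) hold, and define

$$\hat\varphi_1^*(x) := (2^{1/\rho_x} - 1)^{-1}\{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)\} + \hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x).$$

Then, putting $V_3(\rho_x) = \rho_x^{-2}2^{-2/\rho_x}/(2^{-1/\rho_x} - 1)^4$, we have

$$\sqrt{2k_n}\,\frac{\hat\varphi_1^*(x) - \varphi(x)}{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)} \xrightarrow{d} \mathcal{N}(0, V_3(\rho_x)).$$

Moreover,

$$\{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) - \hat\varphi_{1 - (2k_n - 1)/(n\hat F_X(x))}(x)\}\Big/\left\{\frac{n}{2k_n}U'\left(\frac{n}{2k_n}\right)\right\} \xrightarrow{p} \rho_x(1 - 2^{-1/\rho_x}). \qquad (2.7)$$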
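
When $\rho_x$ is known, part (iii) of Theorem 2.5 translates directly into a point estimate and an asymptotic confidence interval for $\varphi(x)$ built from two order statistics. The sketch below assumes no ties among the outputs with $X_i \le x$; the function name, the fixed normal quantile 1.959964 and the synthetic data are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple


def frontier_known_rho(ys_below: Sequence[float], k: int, rho: float,
                       z: float = 1.959964) -> Tuple[float, Tuple[float, float]]:
    """Estimate of phi(x) with known rho_x and a normal-theory confidence interval."""
    ordered = sorted(ys_below, reverse=True)
    if k < 1 or 2 * k > len(ordered):
        raise ValueError("need 1 <= k <= N_x / 2")
    q1 = ordered[k - 1]        # quantile of order 1 - (k - 1)/N_x
    q2 = ordered[2 * k - 1]    # quantile of order 1 - (2k - 1)/N_x
    spacing = q1 - q2
    estimate = q1 + spacing / (2.0 ** (1.0 / rho) - 1.0)
    # Asymptotic variance V_3(rho) of the normalised estimation error.
    v3 = rho ** -2 * 2.0 ** (-2.0 / rho) / (2.0 ** (-1.0 / rho) - 1.0) ** 4
    half_width = z * math.sqrt(v3) * spacing / math.sqrt(2.0 * k)
    return estimate, (estimate - half_width, estimate + half_width)


# Synthetic check: exact quantiles of a tail with rho = 2, ell_x = 1, n = 5000
# and frontier value 1 (the j-th largest output is 1 - (j/n)**(1/2)).
n = 5000
synthetic = [1.0 - math.sqrt(j / n) for j in range(1, n + 1)]
estimate, (lower, upper) = frontier_known_rho(synthetic, k=78, rho=2.0)
print(estimate, upper - lower)  # estimate close to 1, interval length about 0.067
```

For this idealised tail the interval length does not depend on $k$, since the spacing grows like $\sqrt{k}$ and is divided by $\sqrt{2k}$.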

Remark 2.8. Note that Theorem 2.5(ii) is still valid if the estimate $\hat\rho_x$ is replaced by the true value $\rho_x$, up to a change of the asymptotic variance. It is easy to see that $V_2(\rho_x) > V_3(\rho_x)$ and so the estimator $\hat\varphi_1^*(x)$ of $\varphi(x)$ is asymptotically more efficient than $\hat\varphi_1^1(x)$. We also conclude from (2.7) that $\hat\varphi_1^1(x)$ and $\hat\varphi_1^*(x)$ have the same rate of convergence, namely $nU'(n/(2k_n))/(2k_n)^{3/2}$. In the particular case where $L_x(\{\varphi(x) - y\}^{-1}) = \ell_x$ in (2.3), we have $U'(t) = \frac{1}{\rho_x}(\frac{1}{\ell_x})^{1/\rho_x}(\frac{1}{t})^{1 + 1/\rho_x}$. Note, also, that in this particular case, the condition of Theorem 2.5(i) holds, that is, $F'(\varphi(x) - \frac{1}{t} \mid x) = \frac{\rho_x\ell_x}{F_X(x)}(\frac{1}{t})^{\rho_x - 1} \in RV_{1 - \rho_x}$. However, the conditions of Theorem 2.3(iii) and (iv) do not hold since both functions $t^{1 + 1/\rho_x}U'(t) = \frac{1}{\rho_x}(\frac{1}{\ell_x})^{1/\rho_x}$ and $t^{\rho_x - 1}F'(\varphi(x) - \frac{1}{t} \mid x) = \frac{\rho_x\ell_x}{F_X(x)}$ are constant in $t$. Nevertheless, the conclusions of Theorem 2.3(iii) and (iv) hold in this case for all sequences $k_n \to \infty$ satisfying $k_n/n \to 0$. The same is true for the conclusion of Theorem 2.5(ii).

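Theorem 2.6. If the condition of Corollary 2.1 holds, $k_n \to \infty$ and $k_n/n \to 0$ as $n \to \infty$, then

$$\{\rho_x k_n^{1/2}/(k_n/n\ell_x)^{1/\rho_x}\}\{\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x) + (k_n/n\ell_x)^{1/\rho_x} - \varphi(x)\} \xrightarrow{d} \mathcal{N}(0, 1) \quad\text{as } n \to \infty.$$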

Remark 2.9. The optimization of the asymptotic mean-squared error of $\hat\varphi_{1 - (k_n - 1)/(n\hat F_X(x))}(x)$ is not an appropriate criterion for selecting the optimal $k_n$ since the resulting value of $k_n$ does not depend on $n$.

We shall now construct asymptotic confidence intervals for both $\varphi(x)$ and $\varphi_{1 - p_n/F_X(x)}(x)$, using the sums $M_n^{(1)}$ and $M_n^{(2)}$.

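Theorem 2.7.

(i) Under the conditions of Theorem 2.5(i),

$$\sqrt{k_n}\,\frac{\hat\varphi_{1 - k_n/(n\hat F_X(x))}(x) - \varphi_{1 - p_n/F_X(x)}(x)}{M_n^{(1)}\hat\varphi_{1 - k_n/(n\hat F_X(x))}(x)} \xrightarrow{d} \mathcal{N}(0, V_4(\rho_x)),$$

where $V_4(\rho_x) = (1 + 1/\rho_x)^2$, provided that $p_n \to 0$, $np_n \to \infty$ and $k_n = [np_n]$.

(ii) Suppose that the conditions of Theorem 2.4(iii) hold and that $U(\cdot)$ has a regularly varying derivative $U' \in RV_{-\rho_x}$. Define the moment estimator $\tilde\varphi(x) = \hat\varphi_{1 - k_n/(n\hat F_X(x))}(x)\{1 + M_n^{(1)}(1 + \tilde\rho_x)\}$. Then,

$$\sqrt{k_n}\,\frac{\tilde\varphi(x) - \varphi(x)}{M_n^{(1)}(1 + 1/\tilde\rho_x)\hat\varphi_{1 - k_n/(n\hat F_X(x))}(x)} \xrightarrow{d} \mathcal{N}(0, V_5(\rho_x)),$$

$$V_5(\rho_x) = \rho_x^2\left[\frac{\rho_x}{(2 + \rho_x)} - \frac{4\rho_x}{(3 + \rho_x)} + \rho_x(2 + \rho_x)\left\{4 - 8\frac{(2 + \rho_x)}{(3 + \rho_x)} + \frac{(11 + 5\rho_x)(2 + \rho_x)}{(3 + \rho_x)(4 + \rho_x)}\right\}\right].$$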

2.5. Examples 



Example 2.1. We consider the case where the support frontier is linear. We choose $(X, Y)$ uniformly distributed over the region $D = \{(x, y) \mid 0 \le x \le 1, 0 \le y \le x\}$. In this case (see, for example, [3]), it is easy to see that $\varphi(x) = x$ and $F_X(x)[1 - F(y \mid x)] = (\varphi(x) - y)^2$ for all $0 \le y \le \varphi(x)$. Thus, $L_x(\cdot) = \ell_x = 1$ and $\rho_x = 2$ for all $x$. Therefore, the conclusions of all Theorems 2.1-2.6 hold (see Remark 2.8).
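
In this example Corollary 2.1 gives $\sqrt{n}(\varphi(x) - \hat\varphi_1(x)) \to$ Weibull$(1, 2)$, whose mean is $\Gamma(3/2) \approx 0.886$. The small simulation below, which is only a sketch with arbitrarily chosen sample size, number of replications and seed, checks this first moment; the pair (max, min) of two independent uniforms is used to draw uniformly on $D$.

```python
import math
import random


def simulate_fdh_errors(n: int, x: float, reps: int, seed: int = 0) -> list:
    """Normalised FDH errors sqrt(n) * (x - fdh(x)) when (X, Y) is uniform on D."""
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        best = 0.0  # outputs are non-negative, so 0 is a safe starting value
        for _ in range(n):
            u, v = rng.random(), rng.random()
            xi, yi = max(u, v), min(u, v)  # uniform on {0 <= y <= x <= 1}
            if xi <= x and yi > best:
                best = yi
        errors.append(math.sqrt(n) * (x - best))
    return errors


errors = simulate_fdh_errors(n=500, x=0.5, reps=2000)
mc_mean = sum(errors) / len(errors)
limit_mean = math.gamma(1.5)  # mean of the Weibull(1, 2) limit
print(mc_mean, limit_mean)
```

The corresponding bias of the FDH estimator is of order $-\Gamma(3/2)/\sqrt{n}$, whatever the value of $x$, since $\ell_x = 1$ here.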



Example 2.2. We now choose a nonlinear monotone upper boundary given by the Cobb-Douglas model $Y = X^{1/2}\exp(-U)$, where $X$ is uniform on $[0, 1]$ and $U$, independent of $X$, is exponential with parameter $\lambda = 3$ (see, for example, [3]). Here, the frontier function is $\varphi(x) = x^{1/2}$ and the conditional distribution function is $F(y \mid x) = 3x^{-1}y^2 - 2x^{-3/2}y^3$ for $0 < x \le 1$ and $0 \le y \le \varphi(x)$. It is then easily seen that the extreme value condition (2.2) or, equivalently, (2.3) holds with $\rho_x = 2$ and $L_x(z) = F_X(x)[3\varphi(x) - \frac{2}{z}]/[\varphi(x)]^3$ for all $x \in\, ]0, 1]$ and $z > 0$.



3. Finite-sample performance 

The simulation experiments of this section illustrate how the convergence results work 
in practice. We also apply our approach to a real data set on the production activity of 
the French postal services. 



3.1. Monte Carlo experiment 

We simulate 2000 samples of size $n = 5000$ according to the scenario of Example 2.1 above. Here, $\varphi(x) = x$ and $\rho_x = 2$. Denote by $N_x = n\hat F_X(x)$ the number of observations $(X_i, Y_i)$ with $X_i \le x$. By construction of the estimators $\hat\rho_x$ and $\hat\varphi_1^1(x)$, the threshold $k_n(x)$ can vary between 1 and $N_x/4$. For the estimator $\hat\varphi_1^*(x)$ with known $\rho_x$, $k_n(x)$ is bounded by $N_x/2$ and, finally, for the moment estimators $\tilde\rho_x$ and $\tilde\varphi(x)$, the upper bound for $k_n(x)$ is given by $N_x - 1$. So, in our Monte Carlo experiments for the Pickands estimator, $k_n(x)$ was selected on a grid of values determined by the observed value of $N_x$. We choose $k_n(x) = [N_x/4] - k + 1$, where $k$ is an integer varying between 1 and $[N_x/4]$. In the tables below, $N_x$ is the average value observed over the 2000 Monte Carlo replications. The tables display the values of $k_n(x)$, which is the average of the Monte Carlo values of $k_n(x)$ obtained for a fixed selection of values of $k$. For the moment estimators, the upper values of $k_n(x)$ were chosen as $N_x - 1$. The tables display only a part of the results to save space, but in each case, we typically choose a set of values of $k$ that includes not only the most favorable cases, but also covers a wide range of values for $k_n(x)$. These tables provide the Monte Carlo estimates of the bias and the mean-squared error (MSE) of the various estimators computed over the 2000 random replications, as well as the average lengths and the achieved coverages of the corresponding 95% asymptotic confidence intervals. They display only the results for $x$ ranging over $\{0.25, 0.5, 1\}$, to save space.

We will first comment on the results obtained for the Pickands estimators and for the estimator of $\varphi(x)$ obtained with the knowledge that $\rho_x = p + 1 = 2$ (the jump of the joint density of $(X, Y)$ at the frontier); these results are displayed in Tables 1 and 2. We observe that the Pickands estimates $\hat\rho_x$ and $\hat\varphi_1^1(x)$ behave much better when the sample size $N_x$ increases, although the convergence is rather slow. In contrast, even with the smallest sample size $N_x$ (for $x = 0.25$), the estimator $\hat\varphi_1^*(x)$ computed with the true value of $\rho_x = 2$ provides remarkable estimates of $\varphi(x)$ and is rather stable with respect to the choice of $k_n(x)$. We see the improvement of $\hat\varphi_1^*(x)$ over the FDH in terms of the bias, without significantly increasing the MSE. The achieved coverages of the normal confidence intervals obtained from $\hat\varphi_1^*(x)$ are also quite satisfactory and much easier to derive than those obtained from the FDH estimator. As soon as $N_x$ is greater than 1000, all of the estimators provide reasonably good confidence intervals of the corresponding unknown, with quite good achieved coverages. In these cases ($N_x \ge 1000$), we also observe some stability of the results with respect to the choice of $k_n(x)$.

We now turn to the performances of the moment estimators $\tilde\rho_x$ and $\tilde\varphi(x)$. The results are displayed in Table 3. Note that we used the same seed in the Monte Carlo experiments as the one used for the preceding tables. Compared with the Pickands estimators $\hat\rho_x$ and $\hat\varphi_1^1(x)$, we observe here much more reasonable results in terms of the bias and MSE of the estimators $\tilde\rho_x$ and $\tilde\varphi(x)$. In addition, when $N_x$ increases, the results are much less sensitive to the choice of $k_n(x)$ than for the Pickands estimators. We also observe that the most favorable values of $k_n(x)$ for estimating $\rho_x$ and $\varphi(x)$ are not necessarily in the same range of values. We note that the confidence intervals for $\rho_x$ achieve quite reasonable coverage as soon as $N_x$ is greater than, say, 1000. However, the results for the confidence intervals of $\varphi(x)$ obtained from the moment estimator $\tilde\varphi(x)$ are very poor, even when $N_x$ is as large as 5000. A more detailed analysis of the Monte Carlo results allows us to conclude that this comes from an under-evaluation of the asymptotic variance of $\tilde\varphi(x)$ given in Theorem 2.7. Indeed, in most of the cases, the Monte Carlo standard deviation of $\tilde\varphi(x)$ was larger than the asymptotic theoretical expression by a factor of the order 2-5 when $N_x$ equalled 1250, and by a factor of the order 1.3-1.7 when it equalled 5000. So, the poor behavior seems to improve slightly when $N_x$ increases, but at a very slow rate.

Table 1. Pickands and known $\rho_x$ cases: bias (B) and mean-squared error (MSE) of the estimates

            $\hat\rho_x$                  $\hat\varphi_1^1(x)$          $\hat\varphi_1^*(x)$
k_n(x)      B           MSE             B           MSE             B           MSE

x = 0.25, N_x = 312, FDH: B = -0.012591, MSE = 0.000203
  77.7     -0.25757     784.19539      -0.02585     6.93961         .00021      0.00028
  74.4      0.41215      17.20703       0.03723     0.14471         .00024      0.00028
  71.0      0.42344     105.75775      [illegible]  0.89895         .00016      0.00028
  67.7      0.44401      16.30552       0.03877     0.11468         .00030      0.00028
  64.4      0.30552     145.08207       0.02564     1.01166         .00031      0.00029
  61.0      0.68905      35.13730       0.05654     0.24012         .00053      0.00029
  57.7      0.82177   15489.98302       0.05929    89.02353         .00053      0.00029
  54.3      1.17914    1780.66037       0.08527     9.90370         .00055      0.00029
  51.0     -4.41384   13169.38480      -0.33207    74.80129         .00046      0.00030
  47.6      0.03147    3204.61688      -0.00179    14.27123         .00064      0.00029

x = 0.50, N_x = 1250, FDH: B = [illegible], MSE = 0.000200
 312.1      0.09248       0.22503       0.01696     0.00735         .00026      0.00029
 297.0      0.09311       0.24340       0.01668     0.00759         .00012      0.00029
 281.9      0.09124       0.24958       0.01595     0.00742       -0.00001      0.00029
 266.8      0.09201       0.27538       0.01579     0.00780       -0.00009      0.00029
 251.7      0.08954       0.29784       0.01490     0.00797       -0.00042      0.00030
 236.6      0.09840       0.33195       0.01584     0.00831       -0.00049      0.00030
 221.5      0.11387       0.38048       0.01768     0.00893       -0.00043      0.00030
 206.3      0.12297       0.47557       0.01840     0.01038       -0.00060      0.00030
 191.2      0.12060       0.43562       0.01720     0.00881       -0.00081      0.00030
 176.1      0.14573       0.72946       0.01989     0.01371       -0.00080      0.00029

x = 1.00, N_x = 5000, FDH: B = -0.012663, MSE = 0.000202
1250.0      0.02755       0.04085       0.01025     0.00540         .00078      0.00028
1188.0      0.02863       0.04254       0.01047     0.00537         .00085      0.00028
1126.0      0.02780       0.04643       0.00991     0.00557         .00065      0.00029
1064.0      0.02689       0.05068       0.00953     0.00575         .00064      0.00030
1002.0      0.02890       0.05241       0.00981     0.00559         .00061      0.00029
 940.0      0.02670       0.05545       0.00875     0.00552         .00032      0.00029
 878.0      0.02738       0.06064       0.00872     0.00564         .00029      0.00029
 816.0      0.02877       0.06738       0.00882     0.00577         .00024      0.00028
 754.0      0.03001       0.07071       0.00899     0.00562         .00037      0.00028
 692.0      0.03686       0.07869       0.01065     0.00583         .00065      0.00029

Entries printed without a leading digit (for example .00021) have an unrecoverable sign and integer part; [illegible] marks values that could not be recovered.

We could say that using the Pickands estimators $\hat\rho_x$ and $\hat\varphi_1^1(x)$ is only reasonable in our setup when $N_x$ is larger than, say, 1000. These estimators are highly sensitive to the choice of $k_n(x)$. The moment estimators $\tilde\rho_x$ and $\tilde\varphi(x)$ have a much better behavior in terms of bias and MSE, and a greater stability with respect to the choice of $k_n(x)$, even for moderate sample sizes. When $N_x$ is very large ($N_x = 5000$), $\hat\rho_x$ and $\hat\varphi_1^1(x)$ become more accurate than the moment estimators. On the other hand, the confidence intervals of $\rho_x$
constructed from the asymptotic distribution of p x provide more satisfactory results than 
those derived from the limit distribution of p x for large values of N x , say, N x > 1000. For 






Table 2. Pickands and known $\rho_x$ cases: average lengths (avl) and coverages (cov) of the 95% confidence intervals

$k_n(x)$   avl $\hat\rho_x$   cov $\hat\rho_x$   avl $\hat\varphi_1^*(x)$   cov $\hat\varphi_1^*(x)$   avl $\tilde\varphi_1^*(x)$   cov $\tilde\varphi_1^*(x)$

$x = 0.25$, $N_x = 312$

  77.7      630.9019   0.9040     59.3041   0.8925   0.0670   0.9455
  74.4       18.4635   0.9060      1.6821   0.8970   0.0670   0.9505
  71.0       92.5814   0.9000      8.5104   0.8960   0.0670   0.9480
  67.7       18.6125   0.8990      1.5673   0.8910   0.0670   0.9485
  64.4      131.0169   0.8910     10.9372   0.8845   0.0670   0.9525
  61.0       37.9315   0.8960      3.1260   0.8840   0.0671   0.9465
  57.7    14491.7449   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
  54.3     1735.9675   0.8930    129.3070   0.8820   0.0671   0.9430
  51.0    13077.3352   0.8910    981.3170   0.8805   0.0671   0.9440
  47.6     3374.6016   0.8925    224.7041   0.8735   0.0672   0.9410

$x = 0.50$, $N_x = 1250$

 312.1        1.7798   0.9295      0.3232   0.9195   0.0670   0.9485
 297.0        1.8330   0.9255      0.3248   0.9245   0.0669   0.9490
 281.9        1.8810   0.9250      0.3247   0.9240   0.0669   0.9475
 266.8        1.9457   0.9220      0.3269   0.9240   0.0669   0.9460
 251.7        2.0095   0.9200      0.3279   0.9145   0.0668   0.9505
 236.6        2.1038   0.9195      0.3329   0.9165   0.0668   0.9420
 221.5        2.2256   0.9150      0.3409   0.9100   0.0668   0.9390
 206.3        2.3707   0.9115      0.3506   0.9075   0.0668   0.9440
 191.2        2.4375   0.9105      0.3468   0.9085   0.0667   0.9455
 176.1        2.7460   0.9155      0.3754   0.9080   0.0667   0.9440

$x = 1.00$, $N_x = 5000$

1250.0        0.8019   0.9645      0.2909   0.9605   0.0670   0.9540
1188.0        0.8238   0.9625      0.2914   0.9595   0.0670   0.9555
1126.0        0.8463   0.9535      0.2914   0.9495   0.0670   0.9425
1064.0        0.8707   0.9510      0.2915   0.9445   0.0670   0.9435
1002.0        0.8994   0.9530      0.2922   0.9455   0.0670   0.9475
 940.0        0.9273   0.9445      0.2918   0.9420   0.0669   0.9460
 878.0        0.9614   0.9420      0.2923   0.9450   0.0669   0.9420
 816.0        1.0002   0.9450      0.2932   0.9440   0.0669   0.9500
 754.0        1.0426   0.9475      0.2939   0.9460   0.0669   0.9550
 692.0        1.0976   0.9455      0.2966   0.9430   0.0670   0.9455


inference purposes on the frontier function itself, the estimate of the asymptotic variance of the moment estimator $\tilde\varphi(x)$ does not provide reliable confidence intervals, even for relatively large values of $N_x$. In that case, it is better to use the confidence intervals obtained from the asymptotic distribution of the Pickands estimator $\hat\varphi_1^*(x)$.
Table 3. Moment estimators: bias (B), MSE, average lengths (avl) and coverages (cov)

$k_n(x)$   B $\tilde\rho_x$   MSE $\tilde\rho_x$   B $\tilde\varphi(x)$   MSE $\tilde\varphi(x)$   avl $\tilde\rho_x$   cov $\tilde\rho_x$   avl $\tilde\varphi(x)$   cov $\tilde\varphi(x)$

$x = 0.25$, $N_x = 312$

 150.4    0.36520       1.47278   -0.04187   0.00339       2.5969   0.8900     0.0869   0.3350
 137.9    0.35077       1.86333   -0.03615   0.00337       2.8243   0.8905     0.0939   0.3765
 125.3    0.33799       1.26492   -0.03080   0.00226       2.7378   0.8990     0.0893   0.4435
 112.9    0.30315       1.02334   -0.02670   0.00173       2.7495   0.9005     0.0874   0.4840
 100.4    0.27374       0.93872   -0.02284   0.00139       2.8414   0.8930     0.0873   0.5495
  87.9    0.28569       1.22921   -0.01810   0.00137       3.1695   0.8965     0.0936   0.5860
  75.4    0.30500       9.96907   -0.01330   0.00806       7.3693   0.8865     0.2075   0.6340
  62.9    0.26381      29.37920   -0.01097   0.02156      17.2434   0.8880     0.4629   0.6740
  50.5    0.51850      18.67121   -0.00130   0.01090      14.4349   0.8780     0.3524   0.7020
  38.0    0.53418      21.11753    0.00124   0.00956      18.2022   0.8645     0.3897   0.7225
  19.2    0.62323     267.28452    0.00481   0.06789     246.3768   0.8430     3.8848   0.7525
  12.9   -0.30491    1266.44113   -0.00977   0.30730    1431.7282   0.8150    22.2514   0.7315

$x = 0.50$, $N_x = 1250$

 600.5    0.16644       0.16966   -0.09657   0.01004       0.9860   0.8375     0.0645   0.0575
 550.5    0.16412       0.16874   -0.08407   0.00776       1.0281   0.8590     0.0667   0.0890
 500.4    0.16750       0.17596   -0.07212   0.00588       1.0818   0.8735     0.0691   0.1360
 450.5    0.17133       0.18419   -0.06106   0.00440       1.1442   0.8970     0.0715   0.2155
 400.5    0.16370       0.19777   -0.05158   0.00334       1.2099   0.9085     0.0733   0.2945
 350.5    0.15716       0.20738   -0.04270   0.00250       1.2897   0.9225     0.0751   0.3815
 300.5    0.16437       0.23740   -0.03370   0.00182       1.4051   0.9335     0.0778   0.4775
 250.4    0.15151       0.25663   -0.02649   0.00137       1.5307   0.9430     0.0794   0.5650
 200.5    0.13915       0.28167   -0.01987   0.00101       1.7031   0.9415     0.0811   0.6475
 150.5    0.12971       0.36589   -0.01373   0.00082       1.9765   0.9305     0.0836   0.7180
  50.5    0.29865       6.19391    0.00098   0.00356       6.8895   0.8895     0.1734   0.8000
  13.0   -0.58590    9410.59672   -0.01445   1.57034   10243.4270   0.8150   131.6029   0.7550

$x = 1.00$, $N_x = 5000$

2000.0    0.13502       0.05141   -0.14729   0.02230       0.5207   0.7685     0.0664   0.0000
1800.0    0.13019       0.05132   -0.12609   0.01649       0.5471   0.8140     0.0682   0.0025
1600.0    0.12099       0.04935   -0.10701   0.01202       0.5765   0.8455     0.0697   0.0145
1400.0    0.11212       0.05190   -0.08930   0.00855       0.6129   0.8595     0.0712   0.0455
1200.0    0.10555       0.05445   -0.07261   0.00584       0.6593   0.8965     0.0727   0.1055
1000.0    0.09393       0.05677   -0.05771   0.00388       0.7168   0.9180     0.0740   0.2325
 800.0    0.07446       0.05965   -0.04469   0.00251       0.7911   0.9245     0.0748   0.3680
 600.0    0.07713       0.07992   -0.03069   0.00148       0.9179   0.9310     0.0771   0.5615
 400.0    0.06905       0.10581   -0.01877   0.00087       1.1221   0.9415     0.0790   0.7255
 200.0    0.07559       0.20770   -0.00744   0.00059       1.6176   0.9365     0.0830   0.8375
 100.0    0.09821       0.49803   -0.00225   0.00067       2.4204   0.9095     0.0896   0.8465
  50.0    0.15884       1.20953    0.00051   0.00083       3.9082   0.8920     0.1034   0.8420

So, in terms of bias and MSE computed over the 2000 random replications, as well as the average lengths and the achieved coverages of the 95% asymptotic confidence intervals, the moment estimators of $\rho_x$ and $\varphi(x)$ are sometimes preferable to the Pickands estimators and sometimes not. It is difficult to imagine one procedure being preferable in all contexts. Hence, a sensible practice is not to restrict the frontier analysis to one procedure, but rather to check that both Pickands and moment estimators point toward similar conclusions. However, when $\rho_x$ is known, we obtain remarkable results for $\tilde\varphi_1^*(x)$, even when $N_x$ is small, including remarkable properties of the resulting normal confidence intervals, with great stability with respect to the choice of $k_n(x)$. Recall that in most situations described thus far in the econometric literature on frontier analysis, this tail index $\rho_x$ is supposed to be known and equal to $p + 1$ (here, $\rho_x = 2$): this corresponds to the common assumption that there is a jump of the joint density of $(X, Y)$ at the frontier.

This suggests the following strategy with a real data set. If $\rho_x$ is known (typically equal to $p + 1$ if the assumption of a jump at the frontier is reasonable), then we can use the estimator $\tilde\varphi_1^*(x)$. If, on the other hand, $\rho_x$ is unknown, we could consider using the following two-step estimator: first, estimate $\rho_x$ (the moment estimator of $\rho_x$ seems the more appropriate, unless $N_x$ is large enough) and, second, use the estimator $\tilde\varphi_1^*(x)$, as if $\rho_x$ were known, by substituting the estimated value $\hat\rho_x$ or $\tilde\rho_x$ in place of $\rho_x$. In a situation involving a real data set, the best approach is not to favor the moment or the Pickands estimator of $\rho_x$ in the first step, but to compute $\tilde\varphi_1^*(x)$ by substituting each of them in turn, in the hope that the two resulting values of $\tilde\varphi_1^*(x)$ point toward similar conclusions.

It should be clear that the two-step estimator $\tilde\varphi_1^*(x)$, obtained by substituting in the Pickands estimate $\hat\rho_x$, does not necessarily coincide with the Pickands estimator $\hat\varphi_1^*(x)$, which is, instead, obtained by a simultaneous estimation of $\rho_x$ and $\varphi(x)$. Indeed, in our Monte Carlo exercise, we have observed that the most favorable values of $k_n(x)$ for estimating $\rho_x$ and $\varphi(x)$ are not necessarily in the same range of values. Thus, nothing guarantees that the value of $k_n(x)$ selected when computing $\hat\rho_x$ in the first step is the same as the one selected when computing $\tilde\varphi_1^*(x)$. Of course, when $N_x$ is very large, the two values of $k_n(x)$ are expected to be similar, but the idea in the two-step procedure is to use the asymptotic results of the more efficient estimator $\tilde\varphi_1^*(x)$ and not those of $\hat\varphi_1^*(x)$. In the next section, we suggest an ad hoc procedure for determining appropriate values of $k_n(x)$ with a real data set.

3.2. A data-driven method for selecting $k_n(x)$

The question of selecting the optimal value of $k_n(x)$ is still an open issue and is not addressed here. We simply suggest an empirical rule that turns out to give reasonable estimates of the frontier in the simulated samples above.

First, we have observed in our Monte Carlo exercise that the best value of $k_n(x)$ for estimating the index $\rho_x$ is not necessarily the best value for estimating $\varphi(x)$. The idea is thus to select first, for each $x$ (in a chosen grid of values), a grid of values of $k_n(x)$ for estimating $\rho_x$. For the Pickands estimator $\hat\rho_x$, we choose $k_n(x) = [N_x/4] - k + 1$, where $k$ is an integer varying between 1 and $[N_x/4]$, and for the moment estimator $\tilde\rho_x$, we choose $k_n(x) = N_x - k$, where $k$ is an integer varying between 1 and $N_x$. We then evaluate the estimator $\hat\rho_x(k)$ (resp., $\tilde\rho_x(k)$) and select the $k$ where the variation of the results is the smallest. We achieve this by computing the standard deviations of $\hat\rho_x(k)$ (resp., $\tilde\rho_x(k)$) over a 'window' of $2 \times [\sqrt{N_x/4}]$ (resp., $2 \times [\sqrt{N_x}]$) successive values of $k$. The value of $k$ where this standard deviation is minimal defines the value of $k_n(x)$.

We follow the same procedure for selecting a value of $k_n(x)$ for estimating the frontier $\varphi(x)$ itself. Here, in all cases, we choose a grid of values for $k_n(x)$ given by $k = 1, \ldots, [\sqrt{N_x}]$ and select the $k$ where the variation of the results is the smallest. To this end, we compute the standard deviations of $\tilde\varphi_1^*(x)$ (resp., $\hat\varphi_1^*(x)$ and $\tilde\varphi(x)$) over a 'window' of size $2 \times \max(3, [\sqrt{N_x}/20])$ (this corresponds to a window large enough to cover around 10% of the possible values of $k$ in the selected range of values for $k_n(x)$). From now on, we only present illustrations for $\tilde\varphi_1^*(x)$ to save space.
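
The stability rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_k_by_stability` is hypothetical, the window is taken to start at each candidate position (the text does not say whether the window is centered or one-sided), and the population standard deviation is used.

```python
import statistics

def select_k_by_stability(estimates, window):
    """Return (start, sd): the 0-based start of the run of `window`
    successive estimates with the smallest standard deviation, and that sd.

    `estimates[j]` is the estimator evaluated at the (j+1)-th grid value of k.
    Assumption: the window starts at the candidate position.
    """
    if window < 2 or window > len(estimates):
        raise ValueError("window must satisfy 2 <= window <= len(estimates)")
    best_start, best_sd = 0, float("inf")
    for start in range(len(estimates) - window + 1):
        sd = statistics.pstdev(estimates[start:start + window])
        if sd < best_sd:  # strict inequality: ties keep the earliest window
            best_start, best_sd = start, sd
    return best_start, best_sd

# Hypothetical values of an estimator on a grid k = 1, 2, ..., 8.
rho_by_k = [5.0, 1.0, 2.0, 2.1, 2.0, 1.9, 4.0, 0.5]
start, sd = select_k_by_stability(rho_by_k, window=3)
print("most stable window starts at grid position", start + 1)
```

The window length is left as a parameter so that the same helper can serve both the tail-index step and the frontier step, each with its own window size.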

For a sample generated with $n = 1000$ in the uniform case, we get the results shown in Fig. 1.

In Fig. 1, the estimator $\tilde\varphi_1^*(x)$ is first computed with the true value $\rho_x = 2$ (top panel of the figure), then with a plug-in value of $\rho_x$ estimated by the Pickands estimator (middle panel) and finally with a plug-in value of $\rho_x$ estimated by the moment estimator (bottom panel). The pointwise confidence intervals are also displayed. The three right-hand panels correspond to the same data set plus one outlier. This allows us to see how our robust estimators behave in the presence of outlying points, in contrast with the FDH estimator. In particular, due to the remarkable behavior of $\tilde\varphi_1^*(x)$ in the Monte Carlo experiment, if we know that $\rho_x = 2$, then we should use the top panel results and, according to our suggestion at the end of the preceding section, if $\rho_x$ is unknown, we should use, in this particular example, the bottom panel results, where we replace $\rho_x$ by its moment estimator $\tilde\rho_x$ (since here $N_x < 1000$) and continue as if $\rho_x$ were known. It is quite encouraging that the two panels are very similar.

3.3. An application 

We use the same real data example as in [2], which undertook the frontier analysis of 9521 French post offices observed in 1994, with $X$ as the quantity of labor and $Y$ as the volume of delivered mail. In this illustration, we only consider the $n = 4000$ observed post offices with the smallest levels $X_i$. We used the empirical rules explained above for selecting reasonable values for $k_n(x)$. The cloud of points and the resulting estimates are provided in Fig. 2.

To save space, we only represent $\tilde\varphi_1^*(x)$ when $\rho_x$ is supposed to be equal to 2 (left-hand panels) and when it is estimated by the moment estimator (right-hand panels). The FDH estimator is clearly determined by only a few very extreme points. If we delete four extreme points from the sample (represented by circles in the figure), then we obtain the pictures from the top panels: the FDH estimator changes drastically, whereas the extreme-value-based estimator $\tilde\varphi_1^*(x)$ is very robust to the presence of these four extreme points. We also note the considerable stability of the various forms of the estimator $\tilde\varphi_1^*(x)$.
Figure 2. The resulting estimator $\tilde\varphi_1^*(x)$ for the French post offices. The four extreme data points (circles) are included for the bottom panels. From left to right: the case $\rho_x = 2$ and the case where the moment estimate $\tilde\rho_x$ is substituted for $\rho_x$.

4. Concluding remarks

In our approach, we provide the necessary and sufficient condition for the FDH estimator $\hat\varphi_1(x)$ to converge in distribution, we specify its asymptotic distribution with the appropriate convergence rate and provide a limit theorem for moments in a general framework. We also provide further insights and generalize the main result of [1] on robust variants of the FDH estimator, and we provide strongly consistent and asymptotically normal estimators $\hat\rho_x$ and $\tilde\rho_x$ of the unknown conditional tail index $\rho_x$ involved in the limit law of $\hat\varphi_1(x)$. Moreover, when the joint density of $(X, Y)$ decreases to zero or increases toward infinity at a speed of power $\beta_x > -1$ of the distance from the boundary, as is often assumed in the literature, we answer the question of how $\rho_x$ is linked to the data dimension $p + 1$ and to the shape parameter $\beta_x$. The quantity $\beta_x \neq 0$ describes the rate at which the density tends to infinity (in the case $\beta_x < 0$) or to 0 (in the case $\beta_x > 0$) at the boundary. When $\beta_x = 0$, the joint density is strictly positive on the frontier. We establish that $\rho_x = \beta_x + (p + 1)$. As an immediate consequence, we extend the previous results of [12, 14] to the general setting where $p \ge 1$ and $\beta = \beta_x$ may depend on $x$.

We propose new extreme-value-based frontier estimators $\hat\varphi_1^*(x)$, $\tilde\varphi_1^*(x)$ and $\tilde\varphi(x)$, which are asymptotically normally distributed and provide useful asymptotic confidence bands for the monotone frontier function $\varphi(x)$. These estimators have the advantage of not being limited to a bi-dimensional support and benefit from their explicit and easy formulations, which is not the case for estimators defined by optimization problems, such as local polynomial estimators (see, for example, [10]). Their asymptotic normality is derived under quite natural and general extreme value conditions, without Lipschitz conditions on the boundary and without recourse to assumptions either on the marginal distribution of $X$ or on the conditional distribution of $Y$ given $X = x$, as is often the case in both the statistical and the econometric literature on frontier estimation. The study of the asymptotic properties of the different estimators considered in the present paper is easily carried out by relating them to a simple dimensionless random sample and then applying standard extreme value theory (for example, [5, 6]).

Two closely related works in boundary estimation via extreme value theory are [9], in which the estimation of the frontier function at a point $x$ is based on an increasing number of higher order statistics generated by the $Y_i$ observations falling into a strip around $x$, and [8], in which estimators are instead based on a fixed number of higher order statistics. The main difference with the present approach is that Hall et al. [9] only focus on estimation of the support curve of a bivariate density (that is, $p = 1$) in the case $\rho_x > 1$ (that is, the decrease in density is no more than algebraically fast), where it is known that estimators based on an increasing number of higher order statistics give optimal convergence rates. In contrast, Gijbels and Peng [8] consider the maximum of all $Y_i$ observations falling into a strip around $x$ and an end-point type of estimator based on three large order statistics of the $Y_i$'s in the strip. This methodology is closely related and comparable to our estimation method using the Pickands-type estimator, but, like the procedure of [9], it is only valid in the simple case $p = 1$ and involves, in addition to the sequence $k_n$, an extra smoothing parameter (bandwidth of the strip) which also needs to be selected. Moreover, the asymptotic results in [8] are provided for densities of $(X, Y)$ decreasing as a power of the distance from the boundary, whereas the setup in our approach is a general one. Also, note that our transformed dimensionless data set $(Z_1^x, \ldots, Z_n^x)$ is constructed in such a way as to take into account the monotonicity of the frontier (the end-point of the common distribution of the $Z_i^x$'s coincides with the frontier function $\varphi(x)$), the univariate random variables $Z_i^x$ do not depend on the sample size and they allow the available results from standard extreme value theory to be easily employed, which is not the case for either of [8, 9].

It should be clear that the monotonicity constraint on the frontier is the main difference with most of the existing approaches in the statistical literature. Indeed, the joint support of a random vector $(X, Y)$ is often described in the literature as the set $\{(x, y) \mid y \le \phi(x)\}$, where the graph of $\phi$ is interpreted as its upper boundary. As a matter of fact, the function of interest, $\varphi$, in our approach is the smallest monotone non-decreasing function which is greater than or equal to the frontier function $\phi$. To our knowledge, only the FDH and DEA estimators estimate the quantity $\varphi$. Of course, $\phi$ coincides with $\varphi$ when the boundary curve is monotone, but the construction of estimators of the end-point $\phi(x)$ of the conditional distribution of $Y$ given $X = x$ requires a smoothing procedure, which is not the case when the distribution of $Y$ is conditioned by $X \le x$.

We illustrate how the large-sample theory applies in practice by carrying out some Monte Carlo experiments. Good estimates of $\varphi(x)$ and $\rho_x$ may require a large sample of the order of several thousand. Theoretically selecting the optimal order $\alpha(k_n(x))$ of the extreme conditional quantiles used for estimating $\varphi(x)$ and/or $\rho_x$ is a difficult question that is worthy of future research. Here, we suggest a simple automatic data-driven method that provides a reasonable choice of the sequence $\{k_n(x)\}$ for large samples.

The empirical study reveals that the simultaneous estimation of the tail index and of the frontier function requires large sample sizes to provide sensible results. The moment estimators of $\rho_x$ and of $\varphi(x)$ sometimes provide better estimates than the Pickands estimators and sometimes not. When considering bias and MSE, the moment estimators $\tilde\varphi(x)$ and $\tilde\rho_x$ provide more accurate estimates, but when the sample size is large enough, the Pickands estimators $\hat\varphi_1^*(x)$ and $\hat\rho_x$ significantly improve and even seem to outperform the moment estimators. As far as the inference on $\rho_x$ is concerned, $\tilde\rho_x$ also provides quite reliable confidence intervals, but $\hat\rho_x$ provides more satisfactory results for sufficiently large samples. However, when inference about the frontier function itself is concerned, the moment estimator provides very poor results compared with the Pickands estimator.

On the other hand, the performance of the estimator $\tilde\varphi_1^*(x)$, computed when $\rho_x$ is known, is quite remarkable, even compared with the popular FDH. The confidence intervals for $\varphi(x)$ are very easy to compute and have quite good coverages. In addition, the results are quite stable with respect to the choice of the 'smoothing' parameter $k_n(x)$. As shown in our illustrations, the estimates also have the advantage of being robust to extreme values. This suggests, even if $\rho_x$ is unknown, the use of a plug-in version of $\tilde\varphi_1^*(x)$ for making inference on $\varphi(x)$: here, in a first step, we estimate $\rho_x$ (using the moment estimator, unless $N_x$ is large enough), then we use the asymptotic results for $\tilde\varphi_1^*(x)$, as if $\rho_x$ were known. A sensible practice is not to restrict the first step to one procedure, but rather to check that both Pickands and moment estimators point toward similar conclusions.

Appendix: Proofs 

Proof of Theorem 2.1. Let $Z^x = Y\mathbb{1}(X \le x)$ and $F_x(\cdot) = \{1 - F_X(x)[1 - F(\cdot|x)]\}\mathbb{1}(\cdot \ge 0)$. It can be easily seen that $P(Z^x \le y) = F_x(y)$ for any $y \in \mathbb{R}$. Therefore, $\{Z_i^x = Y_i\mathbb{1}(X_i \le x), i = 1, \ldots, n\}$ is an i.i.d. sequence of random variables with common distribution function $F_x$. Moreover, it is easy to see that the right end-point of $F_x$ coincides with $\varphi(x)$ and that $\max_{i=1,\ldots,n} Z_i^x$ coincides with $\hat\varphi_1(x)$. Thus, assertion (i) follows from the Fisher-Tippett theorem. It is well known that the normalized maxima $b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) \xrightarrow{d} G_x$ (that is, $F_x$ belongs to the domain of attraction of $G_x = \Psi_{\rho_x}$) if and only if

$$\bar F_x(\varphi(x) - 1/t) \in RV_{-\rho_x}, \tag{A.1}$$

where $\bar F_x = 1 - F_x$. This necessary and sufficient condition is equivalent to (2.2). In this case, the norming constant $b_n$ can be taken to be equal to $\varphi(x) - \inf\{y \ge 0 \mid F_x(y) \ge 1 - \frac{1}{n}\} = \varphi(x) - \inf\{y \ge 0 \mid F(y|x) \ge 1 - \frac{1}{nF_X(x)}\}$, which gives assertion (ii). For assertion (iii), since (A.1) holds and $E[|Z^x|^k] = F_X(x)E(Y^k \mid X \le x) \le \varphi(x)^k$, it is immediate (see [16], Proposition 2.1) that $\lim_{n\to\infty} E\{b_n^{-1}(\hat\varphi_1(x) - \varphi(x))\}^k = (-1)^k\Gamma(1 + k/\rho_x)$. Likewise, the last result follows from [16], Corollary 2.3. □
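
The reduction used in this proof, from the pairs $(X_i, Y_i)$ to the univariate sample $Z_i^x = Y_i\mathbb{1}(X_i \le x)$ whose maximum is the FDH estimator, can be illustrated with a short Python sketch. The function names `z_sample` and `fdh` are hypothetical, inputs are stored as tuples, and the inequality $X_i \le x$ is read componentwise.

```python
def z_sample(xs, ys, x):
    """Transformed sample Z_i^x = Y_i * 1(X_i <= x), componentwise inequality.

    xs: list of input vectors (tuples), ys: list of outputs, x: evaluation point.
    """
    return [y if all(a <= b for a, b in zip(xi, x)) else 0.0
            for xi, y in zip(xs, ys)]

def fdh(xs, ys, x):
    """FDH estimate at x, computed as the maximum of the Z_i^x."""
    return max(z_sample(xs, ys, x))

# Toy data with two inputs; only the first two units satisfy X_i <= x.
xs = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
ys = [1.0, 5.0, 9.0]
print(z_sample(xs, ys, (2.0, 2.0)), fdh(xs, ys, (2.0, 2.0)))
```

Note that when no observation satisfies $X_i \le x$, this sketch returns 0, since every $Z_i^x$ is then zero.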

Proof of Corollary 2.1. Following the proof of Theorem 2.1, we can set $b_n = \varphi(x) - F_x^{-1}(1 - \frac{1}{n})$, where $F_x^{-1}(t) = \inf\{y \in\, ]0, \varphi(x)] : F_x(y) \ge t\}$ for all $t \in\, ]0, 1]$. It follows from (2.3) that $F_x^{-1}(t) = \varphi(x) - ((1 - t)/\ell_x)^{1/\rho_x}$ as $t \uparrow 1$ and so $b_n = (1/(n\ell_x))^{1/\rho_x}$ for all $n$ sufficiently large. □

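Proof of Corollary 2.2. Under the given conditions, it can be easily seen from (2.3) that

$$f(x, y) = \{\varphi(x) - y\}^{\rho_x - (p+1)}\left[\ell_x\rho_x(\rho_x - 1)\cdots(\rho_x - p)\,\frac{\partial}{\partial x^1}\varphi(x)\cdots\frac{\partial}{\partial x^p}\varphi(x) + o(1)\right] \quad \text{as } y \uparrow \varphi(x),$$

where the term $o(1)$ depends on the partial derivatives of $x \mapsto \ell_x$, $x \mapsto \rho_x$ and $x \mapsto \varphi(x)$. □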



For the next proofs, we need the following lemma whose proof is quite easy and is thus omitted.

Lemma 1. Let $Z_{(1)}^x \le \cdots \le Z_{(n)}^x$ be the order statistics generated by the random variables $Z_1^x, \ldots, Z_n^x$.

(i) If $\hat F_X(x) > 0$, then $\hat\varphi_{1 - k/(n\hat F_X(x))}(x) = Z_{(n-k)}^x$ for each $k \in \{0, 1, \ldots, n\hat F_X(x) - 1\}$.

(ii) For any fixed integer $k \ge 0$, we have $\hat\varphi_{1 - k/(n\hat F_X(x))}(x) = Z_{(n-k)}^x$ as $n \to \infty$, with probability 1.

(iii) For any sequence of integers $k_n \ge 0$ such that $k_n/n \to 0$ as $n \to \infty$, we have $\hat\varphi_{1 - k_n/(n\hat F_X(x))}(x) = Z_{(n-k_n)}^x$ as $n \to \infty$, with probability 1.
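
Part (i) of the lemma can be checked numerically on a toy sample. The sketch below is an illustration only: it assumes a scalar input, strictly positive outputs, and that the empirical quantile frontier of order $\alpha = 1 - k/N_x$ is the smallest $y$ such that at least $N_x - k$ of the observations with $X_i \le x$ have $Y_i \le y$ (written with integer counts to avoid floating-point comparisons). The function names are hypothetical.

```python
def quantile_frontier(xs, ys, x, k):
    """Empirical conditional quantile frontier of order 1 - k/N_x at x.

    Returns the smallest y with #{i : X_i <= x, Y_i <= y} >= N_x - k.
    """
    inside = [y for xi, y in zip(xs, ys) if xi <= x]
    n_x = len(inside)
    if n_x == 0 or not 0 <= k <= n_x - 1:
        raise ValueError("need N_x > 0 and 0 <= k <= N_x - 1")
    needed = n_x - k
    for y in sorted(inside):
        if sum(1 for v in inside if v <= y) >= needed:
            return y

def z_order_statistic(xs, ys, x, k):
    """Order statistic Z_(n-k) of the sample Z_i^x = Y_i * 1(X_i <= x)."""
    z = sorted(y if xi <= x else 0.0 for xi, y in zip(xs, ys))
    return z[len(z) - k - 1]

xs = [0.2, 0.9, 0.4, 0.1, 0.7]
ys = [1.0, 4.0, 3.0, 2.0, 5.0]
for k in range(3):  # N_x = 3 at x = 0.5, so k ranges over 0, 1, 2
    print(k, quantile_frontier(xs, ys, 0.5, k), z_order_statistic(xs, ys, 0.5, k))
```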

Proof of Theorem 2.2. (i) Since $\varphi(x) = F_x^{-1}(1)$ and $\hat\varphi_1(x) = Z_{(n)}^x$ for all $n \ge 1$, we have $b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) = b_n^{-1}(Z_{(n)}^x - F_x^{-1}(1))$. Hence, if $b_n^{-1}(\hat\varphi_1(x) - \varphi(x)) \xrightarrow{d} G_x$, then $b_n^{-1}(Z_{(n)}^x - F_x^{-1}(1))$ converges to the same distribution $G_x$. Therefore, following [18], Theorem 21.18, $b_n^{-1}(Z_{(n-k)}^x - F_x^{-1}(1)) \xrightarrow{d} H_x$ for any integer $k \ge 0$, where $H_x(y) = G_x(y)\sum_{i=0}^{k}(-\log G_x(y))^i/i!$. Finally, since $Z_{(n-k)}^x \overset{a.s.}{=} \hat\varphi_{1 - k/(n\hat F_X(x))}(x)$ as $n \to \infty$, in view of Lemma 1(ii), we obtain $b_n^{-1}(\hat\varphi_{1 - k/(n\hat F_X(x))}(x) - \varphi(x)) \xrightarrow{d} H_x$.

(ii) Writing $b_n^{-1}(\hat\varphi_\alpha(x) - \varphi(x)) = b_n^{-1}(\hat\varphi_\alpha(x) - \hat\varphi_1(x)) + b_n^{-1}(\hat\varphi_1(x) - \varphi(x))$, it suffices to find an appropriate sequence $\alpha = \alpha_n \to 1$ such that $b_n^{-1}(\hat\varphi_{\alpha_n}(x) - \hat\varphi_1(x)) \xrightarrow{a.s.} 0$. Aragon et al. [1] (see equation (20)) showed that $|\hat\varphi_\alpha(x) - \hat\varphi_1(x)| \le (1 - \alpha)n\hat F_X(x)F_Y^{-1}(1)$, with probability 1, for any $\alpha > 0$. It thus suffices to choose $\alpha = \alpha_n \to 1$ such that $nb_n^{-1}(1 - \alpha_n) \to 0$. □

Proof of Theorem 2.3. (i) Let $\gamma_x = -1/\rho_x$ in (A.1). The Pickands [15] estimate of the exponent of variation $\gamma_x < 0$ is then given by

$$\hat\gamma_x := (\log 2)^{-1}\log\left\{\frac{Z_{(n-k+1)}^x - Z_{(n-2k+1)}^x}{Z_{(n-2k+1)}^x - Z_{(n-4k+1)}^x}\right\}.$$

Under (2.2), Condition (A.1) holds and so there exists $b_n > 0$ such that $\lim_{n\to\infty} P[b_n^{-1}(Z_{(n)}^x - \varphi(x)) \le y] = \Psi_{-1/\gamma_x}(y)$. Since this limit is unique only up to affine transformations, we have

$$\lim_{n\to\infty} P[c_n^{-1}(Z_{(n)}^x - d_n) \le y] = \Psi_{-1/\gamma_x}(-\gamma_x y - 1) = \exp\{-(1 + \gamma_x y)^{-1/\gamma_x}\}$$

for all $y \le 0$, where $c_n = -\gamma_x b_n$ and $d_n = \varphi(x) - b_n$. Thus, condition (1.1) from Dekkers and de Haan [5] holds. Therefore, $\hat\gamma_x \xrightarrow{p} \gamma_x$ if $k_n \to \infty$ and $k_n/n \to 0$, in view of [5], Theorem 2.1. This gives the weak consistency of $\hat\rho_x$ since $\hat\gamma_x \overset{a.s.}{=} -1/\hat\rho_x$ as $n \to \infty$, in view of Lemma 1(iii).

(ii) Likewise, if $k_n/n \to 0$ and $k_n/\log\log n \to \infty$, then $\hat\gamma_x \xrightarrow{a.s.} \gamma_x$ via [5], Theorem 2.2, and so $\hat\rho_x \xrightarrow{a.s.} \rho_x$.

(iii) We have $U(t) = \inf\{y \ge 0 \mid \frac{1}{1 - F_x(y)} \ge t\}$, which corresponds to the inverse function $(1/(1 - F_x))^{-1}(t)$. Since $\pm t^{1-\gamma_x}U'(t) \in \Pi(A)$ with $\gamma_x = -1/\rho_x < 0$, it follows from [5] (see Theorem 2.3) that $\sqrt{k_n}(\hat\gamma_x - \gamma_x) \xrightarrow{d} \mathcal{N}(0, \sigma^2(\gamma_x))$ with $\sigma^2(\gamma_x) = \gamma_x^2(2^{2\gamma_x+1} + 1)/\{2(2^{\gamma_x} - 1)\log 2\}^2$ for $k_n \to \infty$ satisfying $k_n = o(n/g^{-1}(n))$, where $g(t) := t^{3-2\gamma_x}\{U'(t)/A(t)\}^2$. By using the fact that $\sqrt{k_n}(\hat\rho_x - \rho_x) \overset{a.s.}{=} \sqrt{k_n}(-\frac{1}{\hat\gamma_x} + \frac{1}{\gamma_x})$ as $n \to \infty$, in view of Lemma 1(iii), and applying the delta method, we conclude that $\sqrt{k_n}(\hat\rho_x - \rho_x) \xrightarrow{d} \mathcal{N}(0, \sigma^2(\rho_x))$ with asymptotic variance $\sigma^2(\rho_x) = \sigma^2(\gamma_x)/\gamma_x^4$.

(iv) Under the regularity condition, we have $\pm\{t^{-1-1/\gamma_x}F_x'(\varphi(x) - \frac{1}{t}) - \delta_x F_X(x)\} \in RV_{-\kappa_x}$. The conclusion then follows immediately from Theorem 2.5 of [5] in conjunction with Lemma 1(iii). □
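
As an illustration of the Pickands-type estimate of $\gamma_x$ used in part (i), and of the induced estimate $-1/\hat\gamma_x$ of $\rho_x$, here is a minimal Python sketch operating directly on a sample of $Z_i^x$ values. The function name `pickands_rho` and the toy sample are assumptions made for the example; the sample is a deterministic quantile grid of the survival function $(1 - y)^2$ on $[0, 1]$, for which the tail index is 2.

```python
import math

def pickands_rho(z, k):
    """Pickands-type estimates (gamma_hat, rho_hat) from the Z_i^x sample.

    Uses the order statistics Z_(n-k+1), Z_(n-2k+1), Z_(n-4k+1) and
    rho_hat = -1 / gamma_hat. Requires 1 <= k and 4k <= n.
    """
    zs = sorted(z)
    n = len(zs)
    if k < 1 or 4 * k > n:
        raise ValueError("need 1 <= k and 4k <= n")
    top = zs[n - k]        # Z_(n-k+1), 1-based order statistic
    mid = zs[n - 2 * k]    # Z_(n-2k+1)
    low = zs[n - 4 * k]    # Z_(n-4k+1)
    if top <= mid or mid <= low:
        raise ValueError("tied order statistics: estimate undefined")
    gamma_hat = math.log((top - mid) / (mid - low)) / math.log(2.0)
    if gamma_hat == 0.0:
        raise ValueError("gamma_hat = 0: rho_hat undefined")
    return gamma_hat, -1.0 / gamma_hat

# Toy sample: quantiles of the survival function (1 - y)^2, so rho = 2.
n = 400
z = [1.0 - (1.0 - i / (n + 1)) ** 0.5 for i in range(1, n + 1)]
gamma_hat, rho_hat = pickands_rho(z, k=50)
print(gamma_hat, rho_hat)
```

On real data the estimate is noisy and sensitive to `k`, which is what motivates the selection rule of Section 3.2.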

Proof of Theorem 2.4. We have, by Lemma 1(iii), that for each $j = 1, 2$,

$$M_n^{(j)} \overset{a.s.}{=} \frac{1}{k}\sum_{i=0}^{k-1}\left(\log Z_{(n-i)}^x - \log Z_{(n-k)}^x\right)^j \quad \text{as } n \to \infty, \text{ with probability 1;} \tag{A.2}$$

$-1/\tilde\rho_x$ then coincides almost surely, for all $n$ large enough, with the well-known moment estimator $\tilde\gamma_x$ (given by [6], equation (1.7)) of the index defined in (A.1) by $\gamma_x = -1/\rho_x$. Hence, Theorem 2.4(i) and (ii) follow from the weak and strong consistency of $\tilde\gamma_x$ proved in [6], Theorem 2.1. Likewise, Theorem 2.4(iii) follows by applying [6], Corollary 3.2, in conjunction with the delta method. □
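
The quantities $M_n^{(j)}$ in (A.2) translate directly into code. The sketch below computes them from the $Z_i^x$ sample and then forms a moment-type estimate of $\gamma_x$; the combination $\gamma = M^{(1)} + 1 - \frac{1}{2}\{1 - (M^{(1)})^2/M^{(2)}\}^{-1}$ is the standard form of the moment estimator in the extreme value literature and is taken here as an assumption, since equation (1.7) of [6] is not reproduced in this text. The function name `moment_rho` and the toy sample are likewise hypothetical.

```python
import math

def moment_rho(z, k):
    """Moment-type estimates (gamma_tilde, rho_tilde) from the Z_i^x sample.

    M1, M2 follow (A.2): averages of (log Z_(n-i) - log Z_(n-k))^j, i < k.
    Assumed standard form: gamma = M1 + 1 - 0.5 / (1 - M1^2 / M2).
    """
    zs = sorted(z)
    n = len(zs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    base = zs[n - k - 1]  # Z_(n-k)
    if base <= 0.0:
        raise ValueError("Z_(n-k) must be positive (take k < N_x)")
    logs = [math.log(zs[n - 1 - i]) - math.log(base) for i in range(k)]
    m1 = sum(logs) / k
    m2 = sum(v * v for v in logs) / k
    if m2 == 0.0 or m1 * m1 == m2:
        raise ValueError("degenerate top order statistics")
    gamma_tilde = m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    if gamma_tilde == 0.0:
        raise ValueError("gamma_tilde = 0: rho_tilde undefined")
    return gamma_tilde, -1.0 / gamma_tilde

# Toy sample: quantiles of the survival function (1 - y)^2, so rho = 2.
n = 5000
z = [1.0 - (1.0 - i / (n + 1)) ** 0.5 for i in range(1, n + 1)]
print(moment_rho(z, k=100))
```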

Proof of Theorem 2.5. (i) Under the regularity condition, the distribution function $F_x$ of $Z^x$ has a positive derivative $F_x'(y) = F_X(x)F'(y|x)$ for all $y > 0$ such that $F_x'(\varphi(x) - \frac{1}{t}) \in RV_{1+1/\gamma_x}$. Then

$$\sqrt{2k_n}\,\frac{Z_{(n-k_n+1)}^x - F_x^{-1}(1 - p_n)}{Z_{(n-k_n+1)}^x - Z_{(n-2k_n+1)}^x}$$

is asymptotically normal with mean zero and variance $2^{2\gamma_x+1}\gamma_x^2/(2^{\gamma_x} - 1)^2$. We conclude by using the facts that $F_x^{-1}(1 - p_n) = \varphi_{1 - p_n/F_X(x)}(x)$ and

$$\frac{Z_{(n-k_n+1)}^x - F_x^{-1}(1 - p_n)}{Z_{(n-k_n+1)}^x - Z_{(n-2k_n+1)}^x} \overset{a.s.}{=} \frac{\hat\varphi_{1-(k_n-1)/(n\hat F_X(x))}(x) - F_x^{-1}(1 - p_n)}{\hat\varphi_{1-(k_n-1)/(n\hat F_X(x))}(x) - \hat\varphi_{1-(2k_n-1)/(n\hat F_X(x))}(x)} \quad \text{as } n \to \infty.$$

(ii) We have $\hat\varphi_1^*(x) \overset{a.s.}{=} (2^{-\hat\gamma_x} - 1)^{-1}(Z_{(n-k_n+1)}^x - Z_{(n-2k_n+1)}^x) + Z_{(n-k_n+1)}^x$ as $n \to \infty$. Following [5], Theorem 3.2,

$$\sqrt{2k_n}\,\frac{\hat\varphi_1^*(x) - \varphi(x)}{Z_{(n-k_n+1)}^x - Z_{(n-2k_n+1)}^x}$$

is then asymptotically normal with mean zero and variance $3\gamma_x^2 2^{2\gamma_x-1}/(2^{\gamma_x} - 1)^6$.

(iii) Let $E_{(1)} \le \cdots \le E_{(n)}$ be the order statistics of i.i.d. exponential variables $E_1, \ldots, E_n$. Then, $\{Z_{(n-k+1)}^x\}_{k=1}^n \overset{d}{=} \{U(e^{E_{(n-k+1)}})\}_{k=1}^n$. Writing $V(t) := U(e^t)$, we obtain



"(n-fc„ + l) 



fix) 



o— 7 X i yx _ yx 

* 1 a \n-k n +\) Z/ (n-2fc„ + l) 



2h r 



1 



'2k 



2<k n 



V(E(n-k n + l)) - <f{x) 

2-^ - 1 V(E (n _ kn+1) ) - V(E {n ^ 2kn+1) ) 
V{oo)-V{\ogn/{2k n )) , 1 ' 



V'(logra/(2fc„)) 7 , 
7(S ( „_ fen+ i)) - V(£ ( „_ 2fen+ i)) 1 - 2-t- 



2^V'(E {n _ 2kn+1) ) 



2 7 * V'(E {n ^ 2kri+1) ) 
1-2-f* V'{\ogn/(2k n )) 



V2k^ f V'(E {n _ 2kn+1) ) V(E {n _ kn+1) )-V(\ogn/(2k n )) 
lx \V'(logn/(2k n )) lx V'(logn/(2k n )) 

V'(\ogn/{2k n )) 

V(E(n-k„+l)) — V{E( n -2k n +l)) 

The first term on the right-hand side tends to zero as established by Dekkers and de Haan 
([5], Proof of Theorem 3.2). The second term converges in distribution to 7V(0, 1) x , 
in view of Lemma 3.1 and [5], Corollary 3.1. The third term converges in probability to 
2^-1 by the same Corollary 3.1. This ends the proof of (iii), in conjunction with the fact that

$$\sqrt{2k_n}\,\frac{\tilde\varphi_1^*(x) - \varphi(x)}{\hat\varphi_{1-(k_n-1)/(n\hat F_X(x))}(x) - \hat\varphi_{1-(2k_n-1)/(n\hat F_X(x))}(x)} \overset{a.s.}{=} \sqrt{2k_n}\left\{\frac{1}{2^{-\gamma_x} - 1} + \frac{Z_{(n-k_n+1)}^x - \varphi(x)}{Z_{(n-k_n+1)}^x - Z_{(n-2k_n+1)}^x}\right\} \quad \text{as } n \to \infty,$$

with probability 1. □




Proof of Theorem 2.6. Write $\bar F_x(y) := F_X(x)[1 - F(y|x)]$ and $F_x(y) := 1 - \bar F_x(y)$ for all $y \ge 0$. Let $R_x(y) := -\log\{\bar F_x(y)\}$ for all $y \in [0, \varphi(x)[$ and let $E_{(n-k_n+1)}$ be the statistic of order $n - k_n + 1$ generated by $n$ independent standard exponential random variables. $Z_{(n-k_n+1)}^x$ then has the same distribution as $R_x^{-1}[E_{(n-k_n+1)}]$, where $R_x^{-1}(t) := \inf\{y \ge 0 \mid R_x(y) \ge t\} = \inf\{y \ge 0 \mid F_x(y) \ge 1 - e^{-t}\} := F_x^{-1}(1 - e^{-t})$. Hence,

$$Z_{(n-k_n+1)}^x - F_x^{-1}\Big(1 - \frac{k_n}{n}\Big) \overset{d}{=} R_x^{-1}[E_{(n-k_n+1)}] - R_x^{-1}\Big[\log\frac{n}{k_n}\Big] = \Big[E_{(n-k_n+1)} - \log\frac{n}{k_n}\Big](R_x^{-1})'\Big[\log\frac{n}{k_n}\Big] + \frac{1}{2}\Big[E_{(n-k_n+1)} - \log\frac{n}{k_n}\Big]^2(R_x^{-1})''[\delta_n],$$

provided that $E_{(n-k_n+1)} \wedge \log(n/k_n) < \delta_n < E_{(n-k_n+1)} \vee \log(n/k_n)$. By the regularity condition (2.3), we have that $R_x^{-1}(t) = \varphi(x) - (e^{-t}/\ell_x)^{1/\rho_x}$ for all $t$ large enough. Therefore, for all $n$ sufficiently large,

$$\{\rho_x k_n^{1/2}/(k_n/n\ell_x)^{1/\rho_x}\}\,[Z_{(n-k_n+1)}^x - F_x^{-1}(1 - k_n/n)] \overset{d}{=} k_n^{1/2}[E_{(n-k_n+1)} - \log(n/k_n)] - \{k_n^{1/2}/2\rho_x\}[E_{(n-k_n+1)} - \log(n/k_n)]^2\exp\{-[\delta_n - \log(n/k_n)]/\rho_x\}.$$

Since $k_n^{1/2}[E_{(n-k_n+1)} - \log(n/k_n)] \xrightarrow{d} \mathcal{N}(0, 1)$ and $|\delta_n - \log(n/k_n)| \le |E_{(n-k_n+1)} - \log(n/k_n)| \xrightarrow{p} 0$ as $n \to \infty$, we obtain $\{\rho_x k_n^{1/2}/(k_n/n\ell_x)^{1/\rho_x}\}[Z_{(n-k_n+1)}^x - F_x^{-1}(1 - k_n/n)] \xrightarrow{d} \mathcal{N}(0, 1)$ as $n \to \infty$. Since $F_x^{-1}(t) = \varphi(x) - ((1 - t)/\ell_x)^{1/\rho_x}$ for all $t < 1$ large enough, we have $F_x^{-1}(1 - k_n/n) = \varphi(x) - (k_n/n\ell_x)^{1/\rho_x}$ for all $n$ sufficiently large. Thus, $\{\rho_x k_n^{1/2}/(k_n/n\ell_x)^{1/\rho_x}\} \times [Z_{(n-k_n+1)}^x + (k_n/n\ell_x)^{1/\rho_x} - \varphi(x)] \xrightarrow{d} \mathcal{N}(0, 1)$ as $n \to \infty$. We conclude by using the fact that $Z_{(n-k_n+1)}^x \overset{a.s.}{=} \hat\varphi_{1-(k_n-1)/(n\hat F_X(x))}(x)$ as $n \to \infty$. □
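
The limit obtained at the end of this proof suggests a simple normal confidence interval when both $\rho_x$ and $\ell_x$ are treated as known. The Python sketch below is an illustration under exactly that assumption (in practice $\ell_x$ is not known, and this is not claimed to be the procedure used in the simulations): the point estimate is $Z_{(n-k_n+1)}^x + (k_n/n\ell_x)^{1/\rho_x}$ and the standard error is $(k_n/n\ell_x)^{1/\rho_x}/(\rho_x\sqrt{k_n})$. The function name `frontier_known_tail` is hypothetical.

```python
import math
from statistics import NormalDist

def frontier_known_tail(z, k, rho, ell, level=0.95):
    """Estimate and normal confidence interval for the frontier at x.

    z: the Z_i^x sample; k: number of top order statistics (1 <= k <= n);
    rho, ell: tail index and scale, both ASSUMED known for this sketch.
    Returns (estimate, lower, upper).
    """
    zs = sorted(z)
    n = len(zs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    shift = (k / (n * ell)) ** (1.0 / rho)
    estimate = zs[n - k] + shift            # Z_(n-k+1) + (k/(n*ell))^(1/rho)
    std_err = shift / (rho * math.sqrt(k))  # from the N(0, 1) limit
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate, estimate - q * std_err, estimate + q * std_err

# Toy sample: quantiles of the survival function (1 - y)^2 on [0, 1],
# that is, rho = 2, ell = 1 and frontier value 1.
n = 1000
z = [1.0 - (1.0 - i / (n + 1)) ** 0.5 for i in range(1, n + 1)]
print(frontier_known_tail(z, k=50, rho=2.0, ell=1.0))
```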
Proof of Theorem 2.7. (i) As shown in the proof of Theorem 2.5(i), we have $F_x'(\varphi(x) - \frac{1}{t}) \in RV_{1+1/\gamma_x}$. Then, by applying Dekkers et al. [6], Theorem 5.1, in conjunction with (A.2), we get

The proof is completed by simply using the fact that $F_x^{-1}(1 - p_n) = \varphi_{1 - p_n/F_X(x)}(x)$ and $Z_{(n-k_n)}^x \overset{a.s.}{=} \hat\varphi_{1-k_n/(n\hat F_X(x))}(x)$ as $n \to \infty$.

(ii) Since $Z_{(n-k_n)}^x \overset{a.s.}{=} \hat\varphi_{1-k_n/(n\hat F_X(x))}(x)$ and $\tilde\gamma_x \overset{a.s.}{=} -1/\tilde\rho_x$ as $n \to \infty$, we have $\tilde\varphi(x) \overset{a.s.}{=} Z_{(n-k_n)}^x M_n^{(1)}(1 - 1/\tilde\gamma_x) + Z_{(n-k_n)}^x$ as $n \to \infty$. It is then easy to see from (A.2) that $\tilde\varphi(x)$ coincides almost surely, for all $n$ large enough, with the end-point estimator $\hat x_n^*$ of $F_x^{-1}(1)$ introduced by [6], equation (4.8). It is also easy to check that $U(t) = (1/(1 - F_x))^{-1}(t)$ satisfies the conditions of [6], Theorem 3.1, with $\gamma_x = -1/\rho_x < 0$. According to [6], Theorem 5.2, we then have $\sqrt{k_n}\{\hat x_n^* - F_x^{-1}(1)\}/\{M_n^{(1)}Z_{(n-k_n)}^x(1 - \tilde\gamma_x)\} \xrightarrow{d} \mathcal{N}(0, V_5(-1/\gamma_x))$, which gives the desired convergence in distribution of Theorem 2.7(ii) since $F_x^{-1}(1) = \varphi(x)$, $\hat x_n^* \overset{a.s.}{=} \tilde\varphi(x)$, $\tilde\gamma_x \overset{a.s.}{=} -1/\tilde\rho_x$ and $Z_{(n-k_n)}^x \overset{a.s.}{=} \hat\varphi_{1-k_n/(n\hat F_X(x))}(x)$ as $n \to \infty$. □